I can't help but read all the interjections from the core developers as a strong indication of why all these sorts of things tend to die on the vine. From Unladen Swallow onward there have been these groups off doing interesting and awesome experiments to try and make Python faster, and they never actually make it into something that Python end-users can use (yes, I know US had its shortcomings).
This is the work of one (admittedly super smart) guy over two years. Just like Nuitka, something else that was supposed to be impossible but just keeps making steady progress. Maybe I'm just seeing design by committee?
The lone geniuses can only go so far. Maybe the problem is that Python can't quite decide what it wants to be because it's too many things for too many people already.
Is the concept of Python the language, as opposed to Python the ecosystem, valuable enough so that a Python that broke backwards compatibility with all the C extensions would be useful as its own multicore-capable runtime?
PyPy seemed to think so for a while and now has gone hard in the other direction, reimplementing (faking?) a bunch of the CPython extension API so maybe this approach would never work. I don't know, but seeing things like:
> there is a large number of “dark matter” Python (and C extension) code out there that isn’t open-source. We need to be careful not to break it since it might not be feasible for its users to make required changes, or to report problems back upstream to us. In particular, some C extensions protect their own internal state with the GIL. This is a big worry, and might be a big hindrance to adoption of a GIL-free Python.
really make me wonder if the community as a whole would conclude the same on these critical sorts of decisions which shape the future of the language if they were put forward and not just made by a couple people in a closed meeting.
Would you prefer to support some weird arbitrary nameless closed source extensions, or have a multicore Python? This obviously depends on who you are and what you're doing, which leads us back to Python being too much for too many, but even here we can get a feeling for how many people do what with the language.
> Would you prefer to support some weird arbitrary nameless closed source extensions, or have a multicore Python?
There's nothing wrong with staying on an older LTS version of Python. Let the people with the nameless closed-source stuff stick with that. The beauty of open source is that they can fork the older, GIL-ful version of Python and maintain it, if they like.
Multicore would be a tremendous boon to the language.
I’m one of them (though once I had to pay the tax, I tried to forget about it).
My chief complaint was that there was a decent amount of syntactic change for zero benefit to me and many other Python users. F-strings, swapping str<->unicode, the print function. All quite superficial stuff, at least as far as my domain is concerned (data science).
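For reference, a minimal sketch of the kind of change I mean (Python 3 forms shown; the old Python 2 spellings are in the comments):

```python
# Python 2 spellings, for comparison (these do not run on Python 3):
#   print "hello, %s" % name
#   isinstance(u"text", unicode)

name = "world"
print("hello, %s" % name)   # print is a function in Python 3
print(f"hello, {name}")     # f-strings, added in 3.6
isinstance("text", str)     # str is unicode by default in Python 3
```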
It felt like “hey other languages are getting breaking changes, we should too”.
This is completely different. Single-core speeds have not increased meaningfully for years (decades?), so any language that has performance anywhere on its list of priorities must have an answer to multicore computation. I'd put up with a fair amount of pain for this.
Perhaps, dunno, web devs would complain that this change doesn’t help them, and is only a pain. That’s what I disliked in 2->3. I was told that I’m a dinosaur and should put up and shut up. Which eventually I did. But this is my answer to the naysayers this time.
Of course this might still fail on technical grounds, but I'm hopeful; it sounds solid.
Is there some particular reason to believe the Python team is even able to get good insight into regressions in the long tail of python packages?
Also, it's important to remember that a lot of material contributions to the community (either to the foundation, via jobs, or even open-sourcing part of their internal stack) might be coming from closed-source shops in some way. It's not wise to ignore that, and I think the core team is rightfully cognizant of needing to balance it (balance, not tip to one extreme or the other).
I'm astounded that a change which will release untold heisenbugs into the wild is being considered. It changes my view of Python. In terms of inducing subtle, silent breakage in existing code, it reminds me of this horrifying change from PHP 8:
> it reminds me of this horrifying change from PHP 8
The only thing horrifying to me is that that behaviour was there in the first place!
I would be 100% behind those fixes; I assume from your response that you would not do them, for the sake of backwards compatibility.
What would be your solution? To always have these idiosyncrasies in the language, or did you have a problem with how the fixes were implemented or rolled out?
> Concurrency is impossible to prove sound without language-level guarantees.
Not sure I understand that. The proposal is not to simply get rid of the GIL, but to have a two-tier mechanism that ensures correctness for all the C source that uses the macros it is supposed to use and doesn't mess with refcounts behind Python's back (doing sketchy stuff usually ends in pain).
> I can't help but read all the interjections from the core developers as a strong indication of why all these sorts of things tend to die on the vine.
I think this might be a misunderstanding of the nature of the event that these notes are generated from, unless I'm misunderstanding your objection. The point of this Q&A as I saw it was to explore the feasibility of the idea and fully flesh out the costs and benefits so that we can make informed decisions about how to proceed.
The "random interjections" are notes of caution about what trade-offs need to be made. For example, it is very easy to overlook "dark matter" code because we don't have access to it, but it's almost certainly the majority of Python code out there. It is also not a complete deal-breaker to say that some change could break unknown proprietary extensions — otherwise we'd never be able to change anything; the key is that the changes have to be worth it. A lot of that depends on details — if it's easy to update C extensions for nogil mode (even if they were designed without parallelism in mind), then making breaking changes to remove the GIL might not be so bad. If nogil mode requires that most C extensions totally overhaul their reference counting and C API usage and the changes require restructuring code rather than something that can be done with automated search and replace, that's a much bigger cost and will probably come with a long term fork of the ecosystem (which is a huge pain to deal with) and it might not be worth it.
Avoiding this sort of criticism will not make the underlying problems go away, and I think everyone involved understood that this meeting was intended to bring to light any objections that might guide the work towards ultimate resolution.
I don't think it's very coherent to criticise the team because "Python can't quite decide what it wants to be", but also criticise them for not adopting all these crazy cool changes that would fundamentally change Python.
Taking a highly conservative approach to breaking changes is absolutely not the same thing as being indecisive. The Python team has learned from experience how disruptive breaking changes can be.
> really make me wonder if the community as a whole would conclude the same on these critical sorts of decisions which shape the future of the language if they were put forward and not just made by a couple people in a closed meeting.
I for one couldn't care less if some proprietary binaries fail on Python 3.11 or so. That's why we keep multiple versions around (at my last company, I could only use up to 3.6 because that was the version in the Sacred CentOS AMI).
And, of course, a very critical piece of code was depending on a bug in regex that was fixed in 3.8 or so, and decided to break during a demo (where I was using 3.9 instead of 3.6).
I agree. If you're doing something so special in your code that nobody is publicly doing it and you're unable to update the code to make it compatible, then don't upgrade. That's the risk you took when coding it. We shouldn't go out on a limb for hypotheticals that cannot be confirmed. Force them to demonstrate an example publicly so all the cards are on the table. Proprietary code is fine and you can still share concepts by extracting out the issues.
Sorry to hear about your demo. Sounds not fun! At least that's in the past :)
> there is a large number of “dark matter” Python (and C extension) code out there that isn’t open-source. We need to be careful not to break it since it might not be feasible for its users to make required changes, or to report problems back upstream to us. In particular, some C extensions protect their own internal state with the GIL. This is a big worry, and might be a big hindrance to adoption of a GIL-free Python.
It probably doesn't work across minor versions anyway; most stuff isn't built against the limited API.
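For context, an extension can opt into the limited API / stable ABI so that one build keeps working across minor versions; a rough setup.py sketch of that opt-in (the module name and version target here are made up):

```python
from setuptools import setup, Extension

ext = Extension(
    "fastthing",                                       # hypothetical extension module
    sources=["fastthing.c"],
    define_macros=[("Py_LIMITED_API", "0x03060000")],  # restrict to the 3.6+ stable ABI
    py_limited_api=True,                               # tag the built wheel as abi3
)

setup(name="fastthing", version="0.1", ext_modules=[ext])
```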
I think there's a pretty good chance this stuff gets incorporated:
"On a personal level, we are impressed by Sam’s work so far and invited him to join the CPython project. I’m happy to report he is interested, and to help him ramp up to become a core developer, I will be mentoring him. Guido and Neil Schemenauer will help me review code for the interpreter bits I’m unfamiliar with."
12 references to people in one statement, 5 referring to the post author, 1 reference to social fraternity membership, 1 statement of authority.
I'm not sure if there is a common name for this particular source of discomfort, but that quote definitely contains a lot of it. I'm a historical contributor to the Python source repository, but something about the social structure of the project has changed significantly in recent years that would dissuade me from submitting changes in future. The focus in the statement above no longer feels like it is on the actual productive output of the project itself, and in previous years it wasn't like that, nor needed to be like that.
It reminds me of the minutes of a professional schmoozer's business lunch rather than a technical meeting. If you have ever seen a stray engineer at an event like that (or had the misfortune of being that engineer), this feeling probably captures the problem well. Whatever it is, I'd love to see less of it.
> The focus in the statement above no longer feels like it is on the actual productive output of the project itself, and in previous years it wasn't like that, nor needed to be like that.
I don’t believe you. Python was never like this in the past, and has a long history of rejecting objectively good performance enhancements because the BDFL and friends wanted the implementation to remain a simple teaching example for students.
I have been a member of the Python community since 2002 and followed the lists for just as long, not least because of the quality of technical discussion up until the start of the 3.x saga.
I don't think anyone would deny Python has changed since then, I think most notably in the years following the release of Django, Python becoming a go-to web language, and the size of the community exploding.
I don't know anything about how Python is being developed, but I do know that the Python 2 -> Python 3 transition was nearly fatal. And from other projects I know that the easiest way to prevent such events is to put people in charge. Such a fundamental change as getting rid of the GIL may well trigger a similar situation if not done well (and maybe even if done well). It comes down to a choice between managing the current, limited status quo and risking the future of the project on a bet that an improvement can be made without bifurcating the community again.
I find it harsh to suggest that the names in that statement didn't do real work. They've been at it for decades. And the newcomer's report is outstanding in quality.
I'm rather interested in why you feel that way, however. Do you think your work, or the work of some specific project/person, hasn't received the recognition it deserves? Or that the attention of the community has diverted from important work toward subjects that are more eye-catching?
You are wrongly assuming that the core developers are a club that is putting up barriers in order to stay exclusive.
Quite the opposite: just like many other big and important open source projects, they are actively trying to recruit, mentor and urge people to come as far "in" as possible. It takes a lot of work and dedication to get acquainted with a code base.
Core developer means that you work on the core, not that you are a person the “python community” can't do without. If one is going to replace the GIL, a very core feature, then it makes sense to make that person a core member of the python squad.
I'm not an American and so I have no sense of college frats. I'm thinking of Fraternal Orders which are very explicitly male-only organisations. The word fraternity literally means "brotherhood".
The choice of the word "fraternity" by the poster above was deliberate and seems to be used to imply a more severe exclusion of others.
I've understood it the same way. We have fraternities and sororities. A gender-neutral term would be a social club.
However, I've just looked it up in Merriam-Webster, and it looks like this term can include female members too, even though the word and the related adjective have strong masculine associations. Wikipedia basically says the same: "Although membership in fraternities was and mostly still is limited to men, ever since the development of orders of Catholic sisters and nuns in the Middle Ages and henceforth, this is not always the case. There are mixed male and female orders, as well as wholly female religious orders and societies, some of which are known as sororities in North America."
okay, then you’re thinking of Fraternal Orders, which is also something different. I’m aware of what “fraternity” means etymologically, but I’m sure you’re aware that etymology is separate from meaning.
We don't even have to argue anyway. From a dictionary:
> [treated as singular or plural] a group of people sharing a common profession or interests: e.g. “members of the hunting fraternity”.
Contributors can be extinguished after they have done their duty. One of the most famous contributors ever spoke about "minibosses" the last time he was seen dealing with the Python core bureaucrats.
Can you give us more details about this? Do you have a particular story in mind? It seems a lot of fresh blood is coming into the Python project recently, so I'm interested in what you have to say.
But why are you using a throwaway account? And why have you been answering the other comments with newly created accounts as well?
I can share the sentiment that the PSF and the core team have become something like an "old boys' club", and no, new core developers didn't change this ambience you get when you read their forums/mailing lists and Twitters (better to avoid the Twitters, though).
I don’t have anything about a community being an old boys’ club (would be nice to be in one some day, I reckon it’s what older people’s equivalent of a “safe space” is), it’s that they try hard to pretend they are something else that irks me.
But for the life of me I couldn’t find “python” and “minibosses” mentioned in the same document on the internet, except for an integration testing package and a computer game. So I think the story alluded to is bogus.
There’s a minuscule likelihood of it all being a twitter drama with key characters then nuking their tweets or deleting their accounts, but if it was of any significance some blog would have echoed it.
Just came in here briefly to opine that there is a very real risk of a fork if the Python core community does not at least offer a viable alternative expediently.
The economic pressures surrounding the benefits of Gross's changes will likely influence this more than any tears shed over subtle backwards incompatibility.
I believe it was Dropbox that famously released their own private internal Python build a while back and included some concurrency patches.
Many teams might go the route of working from Sam Gross's work, and if we see subtle changes in underlying runtime concurrency semantics or something else backwards incompatible, that's it: either that adoption will roll downhill to a new standard or Python core will have to answer with a suitable GIL-less alternative.
I for one do not want to think about “ANSI Python” runtimes or give the MSFTs etc of the world an opening to divide the user base.
I mean, PyPy is over a decade old now, and micropython is a mere 7 years old. What's another fork? If anything, I strongly prefer languages that have more than one implementation.
MicroPython isn't really an alternative implementation of Python so much as it is an embedded scripting language that looks pretty much identical to normal Python.
However, I would speculate that part of the PSF's hesitancy is specifically about the perceived damage Gross's "GIL-less" changes may do to the backwards compatibility of the runtime semantics.
The PSF in particular has a responsibility here as well, I feel, in that CPython is arguably the working spec or standard against which these other implementations work and are defined.
Do you also strongly prefer languages with different underlying concurrency semantics? While Stackless and PyPy etc. are around and available, which could suggest the answer is "yes", we've been lucky that they haven't fundamentally changed the experience of writing Python.
The possibility that a ton of libraries might now be able to use efficient multi-threaded execution where they were previously constrained to multiprocessing will be a landslide of changes on its own, and likely reminiscent of python 2 -> 3 compatibility if we have to preserve "two ways of doing things."
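As a rough sketch of the status quo (worker counts and loop sizes are arbitrary): CPU-bound work in threads is effectively serialized by the GIL today, which is why libraries reach for multiprocessing instead.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def busy(n):
    # CPU-bound loop; on current CPython the GIL prevents threads from running this in parallel
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls):
    start = time.perf_counter()
    with executor_cls(max_workers=4) as ex:
        list(ex.map(busy, [5_000_000] * 4))
    return time.perf_counter() - start

if __name__ == "__main__":
    print("threads:  ", timed(ThreadPoolExecutor))   # roughly serial under the GIL
    print("processes:", timed(ProcessPoolExecutor))  # scales across cores, but pays process/pickling overhead
```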
I think it's important to ask "what guarantees are in place to keep multiple implementations consistent, and what is the fallout if they are not?" as a lens for determining the degree to which it is or isn't "world-ending".
The Python ecosystem fell apart not too long ago because it was supporting two versions, where one moved from "print " to "print(" and these were incompatible and broke things such as doctests.
There's a reason people started strongly advocating for a hard pivot to 3: there was a very real chance that Python 2 could fork the ecosystem.
Incompatible concurrency semantics would be a much worse can of worms.
> If anything, I strongly prefer languages that have more than one implementation.
What popular languages fall into this category? I can only think of C/C++ and JavaScript, both of which seem like terrible examples of languages that took forever to evolve (people still compile JS down to ES5). I'm not sure what the Java story is, but I would argue it has been terrible at evolving the language as well.
I much prefer languages that have one implementation as a de facto standard, worked on by a core team (e.g. C#, Rust, TypeScript). Sure, there might be a few random implementations, but the language is basically what the main compiler supports. Standards and specifications add so much overhead and I really don't see the value.
> summary: Metapackage to select pypy as python implementation
Pyodide (JupyterLite) compiles CPython to WASM (or LLVM IR?) with LLVM/emscripten IIRC. Hopefully there's a clear way to implement the new GIL-less multithreading support with Web Workers in WASM, too?
The https://rapids.ai/ org has a bunch of fast Python for HPC and cloud, with Dask and your pick of scheduler. Less process overhead, and less need for interprocess locking of memory handles that cross contexts, thanks to a new GIL-removal approach, would be even faster than the debuggable one-process-per-core Python.
Unladen Swallow was a 20% project. Besides, LLVM's JIT turned out to be a disappointment, and I believe that no modern interpreted language uses LLVM's JIT at this moment.
Julia isn't a normal JIT either. Most JITs make speculative guesses based on observed patterns and then have to deoptimize occasionally. Julia only compiles based on type information, so in many ways it is closer to running an ahead-of-time compiler at runtime. Because of this, Julia is often called "just-ahead-of-time" compiled.
It's not quite interpreted. Python is compiled to bytecode when first loaded (that's what all those .pyc files are) and there is no separate compile step as with Java.
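You can see that compile-to-bytecode step directly; a quick sketch (exact opcode names vary between CPython versions):

```python
import dis

def add(a, b):
    return a + b

# Prints the bytecode CPython compiled the function to, the same form
# that gets cached on disk in .pyc files for whole modules.
dis.dis(add)
```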
It might not matter much if Canonical or IBM decided to port a critical mass of open source extensions/packages. Then they could ship the new CPython in place of the old one and mention the differences in the release notes. With one or both throwing their weight behind it, it would gain significant momentum above and beyond the original project.
> It might not matter much if Canonical or IBM decided to port a critical mass of open source extensions/packages. Then they could ship the new CPython in place of the old one and mention the differences in the release notes.
The lifeblood of Canonical and Red Hat / IBM is long term platform support for companies that want their most critical code to not break underneath them.
Even if the open source libraries get ported, there are still plenty of proprietary C extensions out there for which this would be a breaking change - the "dark matter" referred to in this post.
It would make zero sense for them to "throw their weight around" and _unilaterally_ break their customers' code if even the upstream devs didn't want to go through with the changes.
These experimental forks don't aim to change language semantics, so they are quite safe from a fragmentation point of view even if they accidentally get some adoption. But they have so far been explicit about being research projects, uninterested in anything else.
These are exceptionally clear notes. They're easy to read and feel comprehensive. I also note that the author is the current CPython developer in residence (a recently created position).
I am surprised that closed source is suddenly an issue when it comes to the GIL, but half the world breaking on the Python 3 transition was not only intended but actively pushed by various members of the community. Since Linux managed to get rid of the BKL, Python should be able to get rid of the GIL.
They said they would make the non-GIL version opt in (command line flag?) so they wouldn't be breaking the old stuff anyway. It's a solved problem. If anyone moving to a new version of python can't take the time to understand such a small change, then that's on them.
Uber, Lyft, Airbnb most likely (this is generally shorthand for public unicorns and similar well-respected major companies that contribute to open source).
Suppose a Python module written in C registers a method that ends up making a call to C functions larry(), moe(), and then curly() to mutate a global variable "global_mutable_temp" before finally returning a value generated from global_mutable_temp.
1. Supposing this method doesn't currently crash under GIL python, would it be true that this method will also run without crashing on the non-GIL python interpreter?
2. Would it be true that the non-GIL python interpreter will introduce a race to this method (resulting in a runtime error) that didn't exist under the GIL interpreter?
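A rough pure-Python analogy of that scenario (function names copied from the question; note that pure Python never had the guarantee that the whole call runs under a single GIL acquisition the way one C call does, so this only makes the potential interleaving visible rather than modeling the C case exactly):

```python
import threading

global_mutable_temp = 0

def larry():
    global global_mutable_temp
    global_mutable_temp += 1

def moe():
    global global_mutable_temp
    global_mutable_temp *= 2

def curly():
    global global_mutable_temp
    global_mutable_temp -= 1

def method():
    # In the C-extension version, all three mutations would happen while the GIL
    # is held, so no other thread's call could interleave with them.
    larry(); moe(); curly()
    return global_mutable_temp

results = []
threads = [threading.Thread(target=lambda: results.append(method())) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Seeing more than one distinct value means the three-step mutations interleaved.
print(sorted(set(results)))
```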
Why is that relevant? Many of the people contributing significantly to CPython also have full time jobs elsewhere. Sometimes their full time job overlaps with their contributions and sometimes it does not. For example Guido works for Microsoft and is working on CPython performance there, does that mean all of his work needs an asterisk saying it's actually a Microsoft corporate initiative?
PyTorch seems to obfuscate its Facebook ownership in general. At least I could find no mention of it on their "About" page or in the documentation, where the only mention of "Facebook" is a link in the footer to the PyTorch project's own social media page: https://pytorch.org/features/
Any component he soloed while on company time, yes, I would consider that an $employer project (not product, the work is donated under the PSF license and a CLA). We’re talking about ~2 years of paid full time work here.
I think switching the version to 4 is the most viable path. Make the last GIL-ed Python 3 an LTS release that interested parties can hold on to; e.g. Ubuntu can keep Python 3 as a default for a long time. One can always use conda to run multiple versions simultaneously.
And yet the popularity of Python exploded during the 2 to 3 transition. As someone who has migrated a few codebases from 2 to 3, I will gladly do it again to 4 if that brings a GILless Python.
Exactly. The 10% of Python programmers who used and loved Python 2 keep talking about the community "almost dying" at a time when 90% of it joined with a clear preference for Python 3. It's ridiculous.
Maybe not, but as someone in the VFX/CG industry, where we're only just moving to Python 3 this year (still mostly on 2.7) because we didn't really need Unicode, getting rid of the GIL in 3.0 would very likely have made us jump to Python 3 almost instantly, years ago.
Besides the internal politics, I think the greatest blocker to improving Python performance is not giving whoever is running the code any control. If I don't use any naughty C plugins, or if I can assure you that I've annotated all my types and can allow a large class of optimizations, why can't I run my code with some VM flags?
It's one language, but why can't I tune my VM to my needs? I can't imagine Java not letting users tune their GC.
I wonder if a midpoint for this sort of work would be if a major distro or several declared they would move to GIL-free Python?
System Python, at least, is generally only recommended for system libs, and that's a relatively supportable set. Developers use virtualenvs and their own specific interpreter, but it would certainly move the needle on which Python people were scripting and thinking in by default.
This here: "The GIL will still be optionally available as an interpreter startup-time option" seems like a midpoint. Maybe it will even be GIL-by-default for some versions.
> What’s the level of perceived risk that the nogil project will end up not being viable for inclusion in CPython?
(...)
> It all depends on how well the community adapts C extensions so they don’t cause downright crashes of the interpreter. Then, the remaining long tail is community adopting free threads in their applications in a way that is both correct and scales well. Those two are the biggest challenges but we have to be optimistic.
Even if it's 10% of the mess the py2 -> py3 path was, it still worries me. I hope I'm wrong and it's much less than that (for the fatal cases at least, and similar or non-improved perf for the rest).
What the Python committee considered infeasible was almost done by a lone hero. Since the previous decision to turn print into a function (which no one asked for) broke everybody's code for no reason and took ten years to be adopted, they will not push (for the one change everyone wants) for the foreseeable future. Although it does not seem to be that impossible after all.
I am glad they will invite Sam, the lone hero we needed, and hope he will be given some ownership of the task and not get swamped in committee-isms through an "embrace, extend, and extinguish". He is on a path to success; the committee is on no path at all.
Just put a timeline on it, call it a failure if it doesn't succeed, and quit avoiding it through "discussions". We get it: it's not planned for the X.XX+1 version, every version.
Python is used by a lot of "semi-technical" people (data scientists, researchers, hobbyists, etc). Removing the GIL isn't going to make their life any easier.