Shouldn't you omit `-s` in the last one? I think you're just effectively testing an empty loop, since you're only actually creating the list in the setup phase.
On my machine, the tests that actually do work are:
$ python3 -m timeit -s 't = (1, 2, "a")' 'list(t)'
5000000 loops, best of 5: 44.8 nsec per loop
$ python3 -m timeit -s 't = (1, 2, "a")' '[*t]'
10000000 loops, best of 5: 22.9 nsec per loop
$ python3 -m timeit '[1, 2, "a"]'
10000000 loops, best of 5: 23.1 nsec per loop
...compared to the empty ones:
$ python3 -m timeit -s '[1, 2, "a"]'
50000000 loops, best of 5: 5.43 nsec per loop
$ python3 -m timeit ''
50000000 loops, best of 5: 5.8 nsec per loop
You’re completely right, I missed stripping it, my bad. (And from the disassembly below we can see that the second and third do roughly the same operation, so they should have very similar if not identical performance; I should have caught that.)
`-s` stands for "setup": the tuple is built only once, and that build is not part of the benchmark.
All three versions only benchmark the construction of the list: two build it from a tuple, while the third is a list literal. However, if you plug the literal into `dis`, you'll see that it compiles to loading a constant tuple and building the list from that.
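For the curious, here's what that looks like (opcode names as of CPython 3.9+; exact bytecode varies by version):

```python
import dis

# On recent CPython, both spellings compile to the same shape:
# build an empty list, load the operand, extend the list in one step.
dis.dis('[*t]')         # BUILD_LIST 0; LOAD_NAME t; LIST_EXTEND 1
dis.dis('[1, 2, "a"]')  # BUILD_LIST 0; LOAD_CONST (1, 2, 'a'); LIST_EXTEND 1
```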
I don't think this will refute the article, but I would have found it more convincing if the benchmark had included a single setitem assignment as well, to be sure the difference wasn't Python doing a lazy dict-or-set type decision on `{}`.
I've also had this thought, but found that inspecting the type shows it's a dictionary by default, and that it's only interpreted as a set if you treat it as one (e.g. add comma-separated elements when instantiating).
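A quick way to check (plain Python, nothing version-specific as far as I know):

```python
# The braces literal is a dict only when empty or when it contains
# key: value pairs; comma-separated bare elements make a set.
print(type({}))         # <class 'dict'>
print(type({"a": 1}))   # <class 'dict'>
print(type({1, 2, 3}))  # <class 'set'>
```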
The assignment operator `=` can never appear in a valid Python expression, so `{a=3, b=2}` is trivially distinguishable from `{a: 3, b: 2}`. (JS makes a similar distinction: there `{[a]: 3, [b]: 2}` with computed keys would evaluate `a` and `b`.)
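A quick sketch of the distinction:

```python
a, b = "x", "y"
# In a dict display, keys are evaluated: the runtime values of a and b.
print({a: 3, b: 2})  # {'x': 3, 'y': 2}

# `=` is a statement-level assignment, not an expression,
# so this can't parse as a dict or set at all:
try:
    compile("{a=3, b=2}", "<demo>", "eval")
except SyntaxError as e:
    print("SyntaxError:", e)
```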
If Python had had built-in set notation from the start, {} might indeed have been a notation for an empty set instead of an empty dict.
However, Python didn't even have sets at all until version 2.3, and they were in a stdlib module instead of being a built-in type until version 2.6. By that time dict notation was well entrenched.
This makes it easy to move fields between `defaults` and `kwargs`. Granted, you could achieve the same isomorphism with the `**` operator, but less readably, IMO.
I still miss CoffeeScript's ability to do `dict(@name, @age)`, though I understand that it creates an unwelcome coupling between the parameter names and their names in the calling scope.
For my part, those names are often the same anyway, though, since the calling scope names are often arbitrary and might as well match the parameter names.
dict() can be useful for enforcing (or indicating to the reader) that keys are strings and valid Python identifiers. {} allows arbitrary keys, as long as they're hashable.
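For example (the names here are just illustrative):

```python
d = dict(name="Ada", year=1815)  # keys are forced to be identifier strings
anything = {(1, 2): "tuple key", 3.5: "float key"}  # {} takes any hashable key

# non-string keys can't even sneak in through keyword expansion:
try:
    dict(**{1: "x"})
except TypeError as e:
    print("TypeError:", e)  # keywords must be strings
```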
I think it's more readable. It matches the type name, it's consistent when you define different structures (`list`, `set`, `dict` vs `[]`, `set()`, `{}`), you can pass it around as a callable, etc. `[]` and `{}` are unnecessary syntactic sugar, not to mention how the same braces are overloaded when sets are defined.
There was some other place, which I can't recall, where I also noticed that (at least in my perception) Python is trying to kill my code-golf tendencies.
That said, it's probably just my own superstition and of course it doesn't remove my premature optimization tendencies...
If you want to implement the suggested changes across your codebase, you can use the `flake8` plugin `flake8-comprehensions`, or `ruff`'s `C4` ruleset, which is an implementation of `flake8-comprehensions`.
I guess I see this post as more about profiling, discovering unintuitive differences, and exploring python internals than offering 20ns optimizations. I personally appreciated it.
Bringing globals into local scope, never using `.` in tight loops, avoiding function calls, etc. [1] sure makes for ugly Python code. But wow, it really can make a difference.
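The classic attribute-hoisting trick, as a sketch (function names are my own, not from any particular source):

```python
def collect_slow(data):
    out = []
    for x in data:
        out.append(x)  # out.append is looked up on every iteration
    return out

def collect_fast(data):
    out = []
    append = out.append  # hoist the attribute lookup out of the loop
    for x in data:
        append(x)
    return out
```

Same semantics, but the second version skips one attribute lookup per iteration, which adds up in hot loops.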
You have to compare that to the amount of energy spent making it 0.1% faster. On average, I'm not sure we'd be saving any energy: the rare tool that would see any remotely relevant gain would be offset by lots of tools where that improvement only kicks in once a week.
How many minutes / how much energy would it take you to setup a project, run a profiler, do the fix, run tests/CI, commit, release, etc. -vs- how much energy would that 0.1% change save over years?
Yeah, I'd be much more concerned about things which cause quadratic (or worse) time increases, like developers who brute-force search lists instead of using dictionaries, than about how they declare their dictionaries, unless I was at the point of profiling and the `dict()` constructor was somehow near the top of the list.
Damn, am I happy you commented this! I saw the same, checked it twice, and a final time before posting. But then the results were suddenly as expected (`dict()` being slower). At least I'm not going mad; he must have changed it just now.
From Alex Martelli, a prominent Python expert [1]:
> I'm one of those who prefers words to punctuation -- it's one of the reasons I've picked Python over Perl, for example. "Life is better without braces" (an old Python motto which went on a T-shirt with a cartoon of a smiling teenager;-), after all (originally intended to refer to braces vs indentation for grouping, of course, but, hey, braces are braces!-).
> "Paying" some nanoseconds (for the purpose of using a clear, readable short word instead of braces, brackets and whatnots) is generally affordable (it's mostly the cost of lookups into the built-ins' namespace, a price you pay every time you use a built-in type or function, and you can mildly optimize it back by hoisting some lookups out of loops).
> So, I'm generally the one who likes to write dict() for {}, list(L) in lieu of L[:] as well as list() for [], tuple() for (), and so on -- just a general style preference for pronounceable code. When I work on an existing codebase that uses a different style, or when my teammates in a new project have strong preferences the other way, I can accept that, of course (not without attempting a little evangelizing in the case of the teammates, though;-).
I find {} much more readable than dict() because the former is more common.
On the other hand, list(L) is more readable than L[:], because the fact that slicing copies is more obscure, and the two aren't equivalent when L is not a list.
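To illustrate (the difference only shows up for non-lists):

```python
L = [1, 2, 3]
copy1, copy2 = list(L), L[:]
print(copy1 == copy2, copy1 is L)  # True False -- both are shallow copies

t = (1, 2, 3)
print(list(t))  # [1, 2, 3] -- always a list
print(t[:])     # (1, 2, 3) -- slicing preserves the original type
```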
My take would be to ignore performance and generally do what is more commonly done.
> I'm one of those who prefers words to punctuation
It's not punctuation, it's notation. Notation is often essential for readability; we use it for mathematics and music for a very good reason. Consider why JSON is generally preferred to XML. While this guy might have a preference for pointless verbosity, most likely the person who needs to read his code won't.
While it points at interesting questions about Python's internals, I hope people writing Python realize that optimizing it is pointless, except for cases where you change the complexity class of an algorithm.
The performance of pure Python code is orders of magnitude worse than that of non-interpreted languages; there's no point trying to shave 0.5% off a 5000% difference.
This logic makes no sense, you're saying that trying to boost performance by small increments 0-10% isn't worthwhile on a Civic because it'll never be a Bugatti? That's the least helpful advice when you have a real-life application written in Python and want to get some easy wins on your tight loops.
Also, Python internals do have some guarantees, but in this case it's a semantic guarantee: because builtins can be shadowed, you'll always have to pay the performance cost of looking them up, which isn't true for {}.
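A contrived demo of why the interpreter can't just assume `dict` means the builtin:

```python
def demo():
    dict = lambda: "shadowed!"  # shadow the builtin in local scope
    return dict(), {}           # the literal can't be shadowed

print(demo())  # ('shadowed!', {})
```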
Considering Python's library ecosystem and the general expanse of Python code, some might say it is the right tool. It's far from ideal, though. A slew of mediocre decisions just begets more, I suppose.
It's unlikely to make an important difference. That's why it's a bad idea to spend time on it. It's much more likely there are other more impactful changes you can do to improve performance, changing the algorithm or using another better tool for the job.
Understanding how python's data structures work is actually really important and can make the difference between an algorithm completing in microseconds vs "basically never". Performance problems in python are going to come down to these crucial factors of when to use a set vs list vs dict or a dataframe or something else. The fact that it's interpreted is just not that significant compared to how data structures are set up and operated on. People are using python to explore and manipulate very large data - it's the most popular language for doing so.
I know premature optimisation is supposedly the root of all evil, but even so, this has got to be a bit too far in the other direction.
A 2x speed improvement is very significant if it happens to be in the critical loop of your code. You want a 5000% difference? Just find 6 tweaks like these and you might just get there.
Of course, in most cases whatever you're doing to build the dict is more likely taking up most of the time, not building the dict itself, but understanding why one option is 2x as fast is still important. Though one can only hope it will soon be irrelevant once JIT-compiled Python becomes a thing.
> A 2x speed improvement is very significant if it happens to be in the critical loop of your code. You want a 5000% difference? Just find 6 tweaks like these and you might just get there.
You're off by at least an order of magnitude, and a very common class of performance problems that exists in any language is copying and allocating memory. Python is no different, and there are plenty of facilities to address that and other normal performance issues. If you're talking strictly about CPU bound performance problems, that's a bit of a red herring considering there is an entire ecosystem of Python tools to write performant CPU bound code.
A little late to reply, but you're of course right, there's no direct way to compare the performance. Python serves (among others) a niche, where it's basically the glue between highly optimized C libraries like numpy. If you're writing that kind of CPU bound code, Python just passes blocks of memory around and it's fine for that. I think this is what you're talking about.
I think outside of that niche, though, there are places where people are writing heavily CPU bound code in Python, because it is so easy to become CPU bound. Case in point: I recently sped up an ML ingestion pipeline by multiple thousands of percent by switching from a pure-python PDF library to one that wraps a .so written in C.
So my point, restated: if you're CPU bound in pure Python code and you have time to try and optimize it, just rewrite the critical section in C and use the FFI. This is how 90% of "Python" libraries get implemented anyway. Compared to this, trying to make Python code more CPU efficient is a waste of time.
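For the simplest possible version of that, stdlib `ctypes` will do; a minimal sketch, assuming a Unix-ish system where it can locate the C math library:

```python
import ctypes
import ctypes.util

# Load the system C math library and call its cos() directly.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double     # declare the C signature so
libm.cos.argtypes = [ctypes.c_double]  # ctypes converts correctly

print(libm.cos(0.0))  # 1.0
```

Real projects would more likely use `cffi`, Cython, or a compiled extension module, but the principle is the same: the hot path runs as native code.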
By the way, I have done work on CPU performance optimizations in Go, Rust and C, and the things you'd typically do are not possible in Python anyway. You're basically left with randomly tweaking the code until the benchmark gives a thumbs up, because it hit on some cpython idiosyncrasy that will completely change around a few versions later.
Optimisation isn't necessarily handwritten assembly, sometimes you just need to squeeze a little extra juice out of your current setup whether that happens to be a python application or some c code.
Sure, but if it's a python application, you have two options:
1) Tweak the code without understanding* how it'll affect performance, because every single line of code hides so much complexity behind it that any improvement you find is almost certainly overfitted to your version of Python. Eventually a benchmark will spit out a nice number. If successful, you make a modest improvement, maybe 30%.
2) Rewrite the critical section in C. If you need to optimize further, you are now able to draw on 50 years of know-how in a well-studied field. The improvement will be on the order of thousands of percent.
They both take about the same amount of time. Why should you do (1)?
* There is a difference between random tips like "replace {} with dict()" and fundamentals like how the CPU cache works, or the branch predictor. The former is almost certainly a quirk of the current Python version, the latter has been the same for the past 20-30 years. If you do work on software performance, you rely on your knowledge of the fundamentals and a tool like `perf` to make educated guesses about where you can save some cycles. These fundamentals are basically irrelevant to Python code and so you have to make effectively random attempts.
(I generally agree that everyone should almost always choose the more readable version)
In this case the speed difference will likely always be there, since `dict()` can be overridden (monkey-patched), hence the interpreter needs to resolve the name at runtime.
Check out the Python bytecode in the article, and remember that bytecode is only generated once from the text of the program (but executed many times). "{}" is translated into a simple "make a dict" instruction.
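You can see it yourself (opcode names as of CPython 3.11+; older versions show CALL_FUNCTION instead of CALL, but the shape is the same):

```python
import dis

dis.dis("dict()")  # LOAD_NAME dict; CALL 0 -- runtime name lookup plus a call
dis.dis("{}")      # BUILD_MAP 0 -- one dedicated instruction
```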
I don't mean optimization is pointless, I mean trying to optimize Python code for CPU performance is pointless. If you're going to spend the time to do that, just rewrite the critical section in a reasonable language, then wrap it in an FFI and use that. Optimize the C code if you need to.
Python is filled with these kinds of traps where there's more than one way to do the same thing but one way is faster for no reason. It's such a bad language.
These days, a decent compiler will output the same machine code, but it wasn't always true, and ++i was faster.
And really, the "dict() vs {}" performance issue can probably get fixed by the interpreter, but it's such a micro-optimization that it's not worth the time and effort.
I think C++ actually has a very similar problem here; the complexity is worse, but I'd bet the runtime isn't an issue because, as you said, the compiler should handle it. The specific problem is that Python has multiple ways to do things that seem pretty much equivalent but have different runtimes. Another example is for loops, list comprehensions, and map. All of these iterate over a collection, but all have different runtimes. That doesn't make much sense, IMO.
> All of these things iterate over a collection but all have different run times.
List comprehensions return a list, and map returns a lazy iterator, while a for loop doesn’t necessarily return anything. I’d expect the overhead of constructing the returned object to have an impact on performance.
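All three do produce the same result, so the difference is purely per-item overhead; a quick sketch:

```python
data = list(range(10))

squares_loop = []
for x in data:                  # plain loop: a method call per iteration
    squares_loop.append(x * x)

squares_comp = [x * x for x in data]             # dedicated LIST_APPEND bytecode
squares_map = list(map(lambda x: x * x, data))   # C-level loop, but a lambda call per item

print(squares_loop == squares_comp == squares_map)  # True
```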
Go for example has fewer ways to write things. With Python, you have to guess what the idiomatic pattern is and it's often not obvious and specific to python. Go went as far as removing the ternary operator to cut down on this kind of thing. Writing things in the idiomatic way is more obvious because there's usually only one way to do a given thing.
> With Python, you have to guess what the idiomatic pattern is and it's often not obvious and specific to python.
This is true of every language. You just don't know Python. You probably decided a long time ago that you hate it because of X reason (usually whitespace-as-syntax, as that's probably the most controversial Python design choice), and so refuse to learn anything more and decide you just hate everything.
> Go went as far as removing the ternary operator to cut down on this kind of thing.
It surprises me when people get confused by the ternary operator. I never thought it was hard. I wish Python had it. Yes, I know, Python has "a = b if c else d", but that's just weird as it flips the order of expressions around to something that doesn't make much sense.
I've been using python for 11 years bud. I know all the idioms. When I say it's a bad language, it's coming from years of experience using it in production environments. But yeah the problem is I never learned a language I've probably written more code in than any other language.
The ternary is just one example. It's not confusing, it's just indicative of the philosophy of python's design, and it's hardly the worst example.
There are more awful things if you keep digging. The above poster set up a strawman about people not liking the whitespace-matters aspect, and while I agree that's bad, with any modern editor it's not really a big problem. I prefer braces, but it's hardly a problem. The actual issues include:
- The global interpreter lock. Multiprocessing is a hacky fix to a problem that most other mainstream languages do not have.
- Performance in general. Python is tens of times slower than a lot of other languages at a lot of tasks.
- Weak type system. This is the most glaring issue for large projects. You can add type hints, but it's obvious the language wasn't designed with them in mind. You need several layers of tooling (mypy, linters, etc.) to paper over this, and even then your codebase probably has untyped files. You also won't know your program is broken until runtime.
- Python 3 is not backwards compatible with Python 2. I know why they did this, but that doesn't mean it wasn't an expensive problem for a lot of orgs, and 2.7 remains in use in a lot of places and is now a security liability.
- The package management and setup process kind of suck. This is ironic since Python is one of the most suggested languages for new devs.
One of the advantages Python has is that it's easy to use, and that has attracted a lot of development, which has led to a wide variety of libraries, including ones like numpy that use C and Fortran under the hood to do things much faster than pure Python. The ease of use also means there's a very large pool of devs to hire from, and that theoretically you're optimizing for lower dev time. In practice, I think the value trade-off is worse than people think, especially for larger codebases.
GIL and performance issues are specific to the interpreter, not the language itself. PyPy is evidence of that.
The 2-to-3 backwards compatibility issues simply weren't possible to prevent. Python had a lot of design mistakes (inadequate differentiation between strings and bytes was one I ran into a lot) that were impossible to fix without breaking things. Sure, they could have made `print` as a statement still work, but I think it's better for things obviously designed for Python 2 to break obviously.
The weak type system... I dunno, I kind of like it. It prevents a lot of the ridiculousness that Java has, where every piece of middleware needs to do type introspection and reflection to get anything done. And it doesn't try to "just make things work" the way JavaScript does, creating bizarre rules where {} + {} is NaN and 1 + {} is "1[object Object]". I can certainly see how it can become troublesome on a large project.
I think the biggest problem with package management is that `pip` wants to install everything as a system package by default (which requires root), rather than going into a user directory. VirtualEnvs did a lot to fix package management.
Hey now, it's possible to have written hundreds of thousands of lines of Python and worked on the internals (back in Python2 days) and still think it's terrible. (I don't, really, but that's because I don't try to use it to write web services.)
I think part of the problem is even comparing Python to Go. They are both "programming languages", but in one of them the expression `x + y` translates to the CPU doing about 2-10 things, and in another the CPU does anywhere between ~1,000 and a few 100,000 things. (I'm ignoring the FFI here.)
It's not like Python does 5,000 things just because it's poorly designed! Actually, I wouldn't want to write math code or do physics simulations in any other language. But conversely, I think if you try to write a backend RPC service in Python, you are just using the wrong tool for the job. Maybe that's because it's what you know and that's fine, but let's not pretend it's a good choice on technical metrics.
In most ways, Python is much closer to bash than it is to Go/Rust/JVM/C. Just like I wouldn't wanna write shell scripts in C, I wouldn't want to write backend code in Python.