Shouldn't you omit `-s` in the last one? I think you're just effectively testing an empty loop, since you're only actually creating the list in the setup phase.
On my machine, the tests that actually do work are:
$ python3 -m timeit -s 't = (1, 2, "a")' 'list(t)'
5000000 loops, best of 5: 44.8 nsec per loop
$ python3 -m timeit -s 't = (1, 2, "a")' '[*t]'
10000000 loops, best of 5: 22.9 nsec per loop
$ python3 -m timeit '[1, 2, "a"]'
10000000 loops, best of 5: 23.1 nsec per loop
...compared to the empty ones:
$ python3 -m timeit -s '[1, 2, "a"]'
50000000 loops, best of 5: 5.43 nsec per loop
$ python3 -m timeit ''
50000000 loops, best of 5: 5.8 nsec per loop
You’re completely right, I missed stripping it, my bad. (And from the disassembly below we can see that the second and third do roughly the same operation, so they should have very similar if not identical performance; I should have caught that.)
`-s` stands for "setup": the tuple is built only once, and that build is not part of the benchmark.
All three versions only benchmark the construction of the list: two build it from a tuple, while the third is a list literal. However, if you plug the literal into `dis`, you'll see that it compiles to loading a constant tuple and building the list from that.
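For the curious, here's what that looks like (opcode names as of CPython 3.9+; exact bytecode varies by version):

```python
import dis

# On recent CPython, both spellings compile to the same shape:
# build an empty list, load the operand, extend the list in one step.
dis.dis('[*t]')         # BUILD_LIST 0; LOAD_NAME t; LIST_EXTEND 1
dis.dis('[1, 2, "a"]')  # BUILD_LIST 0; LOAD_CONST (1, 2, 'a'); LIST_EXTEND 1
```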
I don't think this will refute the article, but I would have found it more convincing if the benchmark had included a single setitem assignment as well, to be sure the difference wasn't Python doing a lazy dict-or-set type decision on `{}`.
I've also had this thought, but found that inspecting the type shows it's a dictionary by default, and that it's only interpreted as a set if you treat it as one (e.g. add comma-separated elements when instantiating).
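A quick way to check (plain Python, nothing version-specific as far as I know):

```python
# The braces literal is a dict only when empty or when it contains
# key: value pairs; comma-separated bare elements make a set.
print(type({}))         # <class 'dict'>
print(type({"a": 1}))   # <class 'dict'>
print(type({1, 2, 3}))  # <class 'set'>
```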
The assignment operator `=` can never appear in a valid Python expression, so `{a=3, b=2}` is trivially distinguishable from `{a: 3, b: 2}`. (JS makes a similar distinction: there `{[a]: 3, [b]: 2}` with computed keys would evaluate `a` and `b`.)
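A quick sketch of the distinction:

```python
a, b = "x", "y"
# In a dict display, keys are evaluated: the runtime values of a and b.
print({a: 3, b: 2})  # {'x': 3, 'y': 2}

# `=` is a statement-level assignment, not an expression,
# so this can't parse as a dict or set at all:
try:
    compile("{a=3, b=2}", "<demo>", "eval")
except SyntaxError as e:
    print("SyntaxError:", e)
```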
If Python had had built-in set notation from the start, {} might indeed have been a notation for an empty set instead of an empty dict.
However, Python didn't even have sets at all until version 2.3, and they were in a stdlib module instead of being a built-in type until version 2.6. By that time dict notation was well entrenched.
This makes it easy to move fields between `defaults` and `kwargs`. Granted, you could achieve the same isomorphism with the `**` operator, but less readably, IMO.
I still miss CoffeeScript's ability to do `dict(@name, @age)`, though I understand that it creates an unwelcome coupling between the parameter names and their names in the calling scope.
For my part, those names are often the same anyway, though, since the calling scope names are often arbitrary and might as well match the parameter names.
dict() can be useful for enforcing (or indicating to the reader) that keys are strings and valid Python identifiers. {} allows arbitrary keys, as long as they're hashable.
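For example (the names here are just illustrative):

```python
d = dict(name="Ada", year=1815)  # keys are forced to be identifier strings
anything = {(1, 2): "tuple key", 3.5: "float key"}  # {} takes any hashable key

# non-string keys can't even sneak in through keyword expansion:
try:
    dict(**{1: "x"})
except TypeError as e:
    print("TypeError:", e)  # keywords must be strings
```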
I think it's more readable. It matches the type name, it's consistent when you define different structures (`list`, `set`, `dict` vs `[]`, `set()`, `{}`), you can pass it around as a callable, etc. `[]` and `{}` are unnecessary syntactic sugar, not to mention how the same braces are overloaded when sets are defined.
There was some other place, which I can't recall, where I also noticed that (at least in my perception) Python is trying to kill my code-golf tendencies.
That said, it's probably just my own superstition and of course it doesn't remove my premature optimization tendencies...
If you want to implement the suggested changes across your codebase, you can use the `flake8` plugin `flake8-comprehensions`, or `ruff`'s `C4` ruleset, which is an implementation of `flake8-comprehensions`.
I guess I see this post as more about profiling, discovering unintuitive differences, and exploring python internals than offering 20ns optimizations. I personally appreciated it.
Bringing globals into local scope, never using `.` in tight loops, avoiding function calls, etc. [1] sure makes for ugly Python code. But wow, it really can make a difference.
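The classic attribute-hoisting trick, as a sketch (function names are my own, not from any particular source):

```python
def collect_slow(data):
    out = []
    for x in data:
        out.append(x)  # out.append is looked up on every iteration
    return out

def collect_fast(data):
    out = []
    append = out.append  # hoist the attribute lookup out of the loop
    for x in data:
        append(x)
    return out
```

Same semantics, but the second version skips one attribute lookup per iteration, which adds up in hot loops.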
You have to compare that to the amount of energy spent making it 0.1% faster. On average, I'm not sure we'd be saving any energy: the rare tool that would see any remotely relevant gain would be offset by lots of tools where that improvement only kicks in once a week.
How many minutes / how much energy would it take you to setup a project, run a profiler, do the fix, run tests/CI, commit, release, etc. -vs- how much energy would that 0.1% change save over years?
Yeah, I'd be much more concerned about things which cause quadratic (or worse) time increases, like developers who brute-force search lists instead of using dictionaries, than about how they declare their dictionaries, unless I was at the point of profiling and the `dict()` constructor was somehow near the top of the list.
Damn, am I happy you commented this! I saw the same, checked it twice, and a final time before posting. But then the results were suddenly as expected (`dict()` being slower). At least I'm not going mad; he must have changed it just now.
From Alex Martelli, a prominent Python expert [1]:
> I'm one of those who prefers words to punctuation -- it's one of the reasons I've picked Python over Perl, for example. "Life is better without braces" (an old Python motto which went on a T-shirt with a cartoon of a smiling teenager;-), after all (originally intended to refer to braces vs indentation for grouping, of course, but, hey, braces are braces!-).
> "Paying" some nanoseconds (for the purpose of using a clear, readable short word instead of braces, brackets and whatnots) is generally affordable (it's mostly the cost of lookups into the built-ins' namespace, a price you pay every time you use a built-in type or function, and you can mildly optimize it back by hoisting some lookups out of loops).
> So, I'm generally the one who likes to write dict() for {}, list(L) in lieu of L[:] as well as list() for [], tuple() for (), and so on -- just a general style preference for pronounceable code. When I work on an existing codebase that uses a different style, or when my teammates in a new project have strong preferences the other way, I can accept that, of course (not without attempting a little evangelizing in the case of the teammates, though;-).
I find {} much more readable than dict() because the former is more common.
On the other hand, list(L) is more readable than L[:], because the fact that slicing copies is more obscure, and the two aren't equivalent when L is not a list.
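To illustrate (the difference only shows up for non-lists):

```python
L = [1, 2, 3]
copy1, copy2 = list(L), L[:]
print(copy1 == copy2, copy1 is L)  # True False -- both are shallow copies

t = (1, 2, 3)
print(list(t))  # [1, 2, 3] -- always a list
print(t[:])     # (1, 2, 3) -- slicing preserves the original type
```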
My take would be to ignore performance and generally do what is more commonly done.
> I'm one of those who prefers words to punctuation
It's not punctuation, it's notation. Notation is often essential for readability; we use it for mathematics and music for a very good reason. Consider why JSON is generally preferred to XML. While this guy might have a preference for pointless verbosity, most likely the person who needs to read his code won't.
While it points at interesting questions about Python's internals, I hope people writing Python realize that optimizing it is pointless, except for cases where you change the complexity class of an algorithm.
The performance of pure Python code is orders of magnitude worse than that of non-interpreted languages; there's no point trying to shave 0.5% off a 5000% difference.
This logic makes no sense, you're saying that trying to boost performance by small increments 0-10% isn't worthwhile on a Civic because it'll never be a Bugatti? That's the least helpful advice when you have a real-life application written in Python and want to get some easy wins on your tight loops.
Also, Python internals do have some guarantees, but in this case it's a semantic guarantee: because builtins can be shadowed, you'll always have to pay the performance cost of looking them up, which isn't true for {}.
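A contrived demo of why the interpreter can't just assume `dict` means the builtin:

```python
def demo():
    dict = lambda: "shadowed!"  # shadow the builtin in local scope
    return dict(), {}           # the literal can't be shadowed

print(demo())  # ('shadowed!', {})
```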
Considering Python's library ecosystem and the general expanse of Python code, some might say it is the right tool. It's far from ideal, though. A slew of mediocre decisions just begets more, I suppose.
It's unlikely to make an important difference. That's why it's a bad idea to spend time on it. It's much more likely there are other more impactful changes you can do to improve performance, changing the algorithm or using another better tool for the job.
Understanding how python's data structures work is actually really important and can make the difference between an algorithm completing in microseconds vs "basically never". Performance problems in python are going to come down to these crucial factors of when to use a set vs list vs dict or a dataframe or something else. The fact that it's interpreted is just not that significant compared to how data structures are set up and operated on. People are using python to explore and manipulate very large data - it's the most popular language for doing so.
I know premature optimisation is supposedly the root of all evil, but even so, this has got to be a bit too far in the other direction.
A 2x speed improvement is very significant if it happens to be in the critical loop of your code. You want a 5000% difference? Just find 6 tweaks like these and you might just get there.
Of course, in most cases whatever you're doing to build the dict is more likely taking up most of the time, not building the dict itself, but understanding why one option is 2x as fast is still important. Though one can only hope it will soon be irrelevant once JIT-compiled Python becomes a thing.
> A 2x speed improvement is very significant if it happens to be in the critical loop of your code. You want a 5000% difference? Just find 6 tweaks like these and you might just get there.
You're off by at least an order of magnitude, and a very common class of performance problems that exists in any language is copying and allocating memory. Python is no different, and there are plenty of facilities to address that and other normal performance issues. If you're talking strictly about CPU bound performance problems, that's a bit of a red herring considering there is an entire ecosystem of Python tools to write performant CPU bound code.
A little late to reply, but you're of course right, there's no direct way to compare the performance. Python serves (among others) a niche, where it's basically the glue between highly optimized C libraries like numpy. If you're writing that kind of CPU bound code, Python just passes blocks of memory around and it's fine for that. I think this is what you're talking about.
I think outside of that niche, though, there are places where people are writing heavily CPU bound code in Python, because it is so easy to become CPU bound. Case in point: I recently sped up an ML ingestion pipeline by multiple thousands of percent by switching from a pure-python PDF library to one that wraps a .so written in C.
So my point, restated: if you're CPU bound in pure Python code and you have time to try and optimize it, just rewrite the critical section in C and use the FFI. This is how 90% of "Python" libraries get implemented anyway. Compared to this, trying to make Python code more CPU efficient is a waste of time.
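For the simplest possible version of that, stdlib `ctypes` will do; a minimal sketch, assuming a Unix-ish system where it can locate the C math library:

```python
import ctypes
import ctypes.util

# Load the system C math library and call its cos() directly.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double     # declare the C signature so
libm.cos.argtypes = [ctypes.c_double]  # ctypes converts correctly

print(libm.cos(0.0))  # 1.0
```

Real projects would more likely use `cffi`, Cython, or a compiled extension module, but the principle is the same: the hot path runs as native code.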
By the way, I have done work on CPU performance optimizations in Go, Rust and C, and the things you'd typically do are not possible in Python anyway. You're basically left with randomly tweaking the code until the benchmark gives a thumbs up, because it hit on some cpython idiosyncrasy that will completely change around a few versions later.
Optimisation isn't necessarily handwritten assembly, sometimes you just need to squeeze a little extra juice out of your current setup whether that happens to be a python application or some c code.
Sure, but if it's a python application, you have two options:
1) Tweak the code without understanding* how it'll affect performance, because every single line of code hides so much complexity behind it that any improvement you find is almost certainly overfitted to your version of Python. Eventually a benchmark will spit out a nice number. If successful, you make a modest improvement, maybe 30%.
2) Rewrite the critical section in C. If you need to optimize further, you are now able to draw on 50 years of know-how in a well-studied field. The improvement will be on the order of thousands of percent.
They both take about the same amount of time. Why should you do (1)?
* There is a difference between random tips like "replace {} with dict()" and fundamentals like how the CPU cache works, or the branch predictor. The former is almost certainly a quirk of the current Python version, the latter has been the same for the past 20-30 years. If you do work on software performance, you rely on your knowledge of the fundamentals and a tool like `perf` to make educated guesses about where you can save some cycles. These fundamentals are basically irrelevant to Python code and so you have to make effectively random attempts.
(I generally agree that everyone should almost always choose the more readable version)
In this case the speed difference will likely always be there, since `dict()` can be overridden (monkey-patched), hence the interpreter needs to resolve the name at runtime.
Check out the Python bytecode in the article, and remember that bytecode is only generated once from the text of the program (but executed many times). "{}" is translated into a simple "make a dict" instruction.
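You can see it yourself (opcode names as of CPython 3.11+; older versions show CALL_FUNCTION instead of CALL, but the shape is the same):

```python
import dis

dis.dis("dict()")  # LOAD_NAME dict; CALL 0 -- runtime name lookup plus a call
dis.dis("{}")      # BUILD_MAP 0 -- one dedicated instruction
```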
I don't mean optimization is pointless, I mean trying to optimize Python code for CPU performance is pointless. If you're going to spend the time to do that, just rewrite the critical section in a reasonable language, then wrap it in an FFI and use that. Optimize the C code if you need to.
Python is filled with these kinds of traps where there's more than one way to do the same thing but one way is faster for no reason. It's such a bad language.
These days, a decent compiler will output the same machine code, but it wasn't always true, and ++i was faster.
And really, the "dict() vs {}" performance issue can probably get fixed by the interpreter, but it's such a micro-optimization that it's not worth the time and effort.
I think C++ actually has a very similar problem here; the complexity is worse, but I'd bet the runtime isn't an issue because, as you said, the compiler should handle it. The specific problem is that Python has multiple ways to do things that seem pretty much equivalent but have different runtimes. Another example is for loops, list comprehensions, and map. All of these iterate over a collection, but all have different runtimes. That doesn't make much sense, IMO.
> All of these things iterate over a collection but all have different run times.
List comprehensions return a list, and map returns a lazy iterator, while a for loop doesn’t necessarily return anything. I’d expect the overhead of constructing the returned object to have an impact on performance.
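All three do produce the same result, so the difference is purely per-item overhead; a quick sketch:

```python
data = list(range(10))

squares_loop = []
for x in data:                  # plain loop: a method call per iteration
    squares_loop.append(x * x)

squares_comp = [x * x for x in data]             # dedicated LIST_APPEND bytecode
squares_map = list(map(lambda x: x * x, data))   # C-level loop, but a lambda call per item

print(squares_loop == squares_comp == squares_map)  # True
```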
Go for example has fewer ways to write things. With Python, you have to guess what the idiomatic pattern is and it's often not obvious and specific to python. Go went as far as removing the ternary operator to cut down on this kind of thing. Writing things in the idiomatic way is more obvious because there's usually only one way to do a given thing.
> With Python, you have to guess what the idiomatic pattern is and it's often not obvious and specific to python.
This is true of every language. You just don't know Python. You probably decided a long time ago that you hate it because of X reason (usually whitespace-as-syntax, as that's probably the most controversial Python design choice), and so refuse to learn anything more and decide you just hate everything.
> Go went as far as removing the ternary operator to cut down on this kind of thing.
It surprises me when people get confused by the ternary operator. I never thought it was hard. I wish Python had it. Yes, I know, Python has "a = b if c else d", but that's just weird as it flips the order of expressions around to something that doesn't make much sense.
I've been using python for 11 years bud. I know all the idioms. When I say it's a bad language, it's coming from years of experience using it in production environments. But yeah the problem is I never learned a language I've probably written more code in than any other language.
The ternary is just one example. It's not confusing, it's just indicative of the philosophy of python's design, and it's hardly the worst example.
There are more awful things if you keep digging. The above poster set up a strawman about people not liking the whitespace-matters aspect, and while I agree that's bad, with any modern editor it's not really a big problem. I prefer braces, but it's hardly a problem. The actual issues include:
- The global interpreter lock. Multiprocessing is a hacky fix to a problem that most other mainstream languages do not have.
- Performance in general. Python is tens of times slower than a lot of other languages at a lot of tasks.
- Weak type system. This is the most glaring issue for large projects. You can add type hints, but it's obvious the language wasn't designed with them in mind. You need several layers of tooling (mypy, linters, etc.) to paper over this, and even then your codebase probably has untyped files. You also won't know your program is broken until runtime.
- Python 3 is not backwards compatible with Python 2. I know why they did this, but that doesn't mean it wasn't an expensive problem for a lot of orgs, and 2.7 remains in use in a lot of places and is now a security liability.
- The package management and setup process kind of suck. This is ironic since Python is one of the most suggested languages for new devs.
One of the advantages Python has is that it's easy to use, and that has attracted a lot of development, which has led to a wide variety of libraries, including ones like numpy that use C and Fortran under the hood to do things much faster than pure Python. The ease of use also means there's a very large pool of devs to hire from, and that theoretically you're optimizing for lower dev time. In practice, I think the value trade-off is worse than people think, especially for larger codebases.
GIL and performance issues are specific to the interpreter, not the language itself. PyPy is evidence of that.
The 2-to-3 backwards compatibility issues simply weren't possible to prevent. Python had a lot of design mistakes (inadequate differentiation between strings and bytes was one I ran into a lot) that were impossible to fix without breaking things. Sure, they could have made `print` as a statement still work, but I think it's better for things obviously designed for Python 2 to break obviously.
The weak type system... I dunno, I kind of like it. It prevents a lot of the ridiculousness that Java has, where every piece of middleware needs to do type introspection and reflection to get anything done. And it doesn't try to "just make things work" the way JavaScript does, creating bizarre rules where {} + {} is NaN and 1 + {} is "1[object Object]". I can certainly see how it can become troublesome on a large project.
I think the biggest problem with package management is that `pip` wants to install everything as a system package by default (which requires root), rather than going into a user directory. VirtualEnvs did a lot to fix package management.
Hey now, it's possible to have written hundreds of thousands of lines of Python and worked on the internals (back in Python2 days) and still think it's terrible. (I don't, really, but that's because I don't try to use it to write web services.)
I think part of the problem is even comparing Python to Go. They are both "programming languages", but in one of them the expression `x + y` translates to the CPU doing about 2-10 things, and in another the CPU does anywhere between ~1,000 and a few 100,000 things. (I'm ignoring the FFI here.)
It's not like Python does 5,000 things just because it's poorly designed! Actually, I wouldn't want to write math code or do physics simulations in any other language. But conversely, I think if you try to write a backend RPC service in Python, you are just using the wrong tool for the job. Maybe that's because it's what you know and that's fine, but let's not pretend it's a good choice on technical metrics.
In most ways, Python is much closer to bash than it is to Go/Rust/JVM/C. Just like I wouldn't wanna write shell scripts in C, I wouldn't want to write backend code in Python.