Hacker News

I can't articulate specifically why but using typing in Python just feels like so much pain compared to other languages that have "opt-in" nominal typing syntax (PHP et al.).

The workflow feels frustrating, the ecosystem seems diverse and has no clear "blessed" path, and I'm still confused about what is bundled by Python and what I need to pull in from an external source. I REALLY want to use mypy but by the time I've figured out how to pull it all together I probably could have finished the program I'm working on.

The relevant factor here might be the size of Python programs I typically work on, somewhere between a few hundred lines and a few thousand.

I'm glad other people are having success because hopefully that'll smooth the pavement for the next time I circle around and try to add some meaningful types to my Python programs.



I think something that might help you understand a little better is this: Python includes everything you need to write type hints, but it doesn't include any type checking functionality. They're making a dynamic language where you can write types if you want, but they don't really do anything. To check your types, you need to use a third-party typechecker like Mypy.
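To make that concrete, here is a minimal sketch (the function name `double` is made up for illustration). The interpreter stores the hints but never checks them, so a call that a checker like Mypy would reject still runs:

```python
def double(x: int) -> int:
    # The hints are stored as metadata; the interpreter never enforces them.
    return x * 2


# A type checker such as Mypy would flag this call (str passed where int is
# expected), but plain Python runs it without complaint.
result = double("ab")

# The annotations are still available for tools to inspect.
hints = double.__annotations__

print(result)
print(hints)
```

Running a separate checker over the file is what turns those hints into actual errors.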


I appreciate your response but I'm actually a pretty experienced Python programmer. I get it conceptually, I've even used it in practice just to get a feel for it, I just found the whole experience _painful_!


The gods intend that you use those annotations with a checker and some IDE that supports autocompletions and highlighting based on those.

They basically lure you into annotating types all over the place so that their products can work better.

When you work in an editor that doesn’t support those fancy things, you’ll feel exhausted pretty soon.


There are a few reasons why it sucks so much:

1. Most of the Python community doesn't really get the importance of static typing. As a result tons of dependencies completely lack typing, or are written in a dynamic way that is incompatible with static typing.

2. You have to go to some significant effort to set up static type checking in CI, and because of the above issue getting 0 errors is much harder than in e.g. Typescript.

3. The semantics of type hints are not defined so there are multiple incompatible interpretations. Better hope your dependencies use the same type checker as you!

4. MyPy is way better than nothing but it's actually pretty buggy and absolutely full of unsoundness and legacy hacks (not that surprising given its age).

I would strongly recommend using Pyright if you make the unfortunate mistake of starting a new project in Python. It is much better than MyPy and the author is some kind of bug fixing robot.


Personally, I like type hints at any program size. I do understand the benefit isn’t as obvious in small programs. But the value skyrockets when you have a large codebase that you can’t keep in your head.

All that said, Mypy is far worse than TypeScript. I think Mypy would be so much better if it improved support around dict inference. It shouldn’t just say “I dunno, it’s a dict of strings?” but rather infer an anonymous TypedDict
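For what it's worth, the workaround today is to spell out the `TypedDict` yourself. A minimal sketch, with `User` and its fields being made-up names for illustration:

```python
from typing import TypedDict


class User(TypedDict):
    # Hypothetical record shape, declared by hand.
    name: str
    age: int


# Without an annotation, Mypy typically infers a broad type such as
# dict[str, object] for this literal rather than a per-key shape.
plain = {"name": "Ada", "age": 36}

# With the explicit annotation, a checker can validate keys and value types.
user: User = {"name": "Ada", "age": 36}

# At runtime a TypedDict value is just an ordinary dict.
print(user == plain, type(user).__name__)
```

The cost is exactly the boilerplate complained about above: the shape has to be declared before the checker can use it.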


MyPy is good, works as expected, and is similar to how TypeScript and Flow work for JavaScript.

The problem with Python is that most code is shit. Sorry, but "fighting the compiler" because you want to return different types based on some flag isn't the compiler being annoying, it's the compiler exposing bad code.

It wasn't until just before summer that typing for Django actually started to work somewhat. Until then, all the stubs did was make mypy stop complaining; only now can it actually catch some errors.

Getting a python project typed is a world of pain, and in the end you can't really trust it anyways because of all the hacks making the type-checker happy.
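The "different types based on some flag" pattern mentioned above can be sketched like this. All function names here are hypothetical, and this is only one way to restructure such code:

```python
from typing import Union


def find_user(name: str, as_dict: bool = False) -> Union[str, dict[str, str]]:
    # Flag-dependent return type: the checker only knows the result is
    # "str or dict", so every caller has to narrow it before use.
    return {"name": name} if as_dict else name


def find_user_name(name: str) -> str:
    # Splitting the function gives each variant a single return type.
    return name


def find_user_record(name: str) -> dict[str, str]:
    return {"name": name}


value = find_user("ada", as_dict=True)
# Callers of the flag version need an isinstance check to satisfy the checker.
record = value if isinstance(value, dict) else {"name": value}
print(record, find_user_record("ada"))
```

With the split versions, the checker's complaint disappears because the ambiguity does, not because it was silenced.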


> MyPy is good, works as expected, and is similar to how TypeScript and Flow work for JavaScript.

Last time I used MyPy (1.5-2 years ago?) it wasn't nearly as full-featured or ergonomic as TypeScript. Has that changed recently? I remember running into lots of cases that it didn't support that were easy in TS, although it was long enough ago that I don't have specific examples.


It certainly still feels half baked. Ironically the best type checker imo, pyright, is written in typescript.


I love me some python, it's my favorite and most-versed language by miles, but typescript definitely gets the W when it comes to its type system. Idk exactly why - both js and python "bolted on" their type systems (syntactic superset), both are highly dynamic, yet python really struggles with typing systems and JITing. I think python is simply "more dynamic" than js. Js has a much simpler data model.


> I think python is simply "more dynamic" than js. Js has a much simpler data model.

I don't think it has much to do with that. There are some major differences between TypeScript and typed Python that explain why TS is so successful and typed Python is not:

1. TypeScript is its own programming language with its own syntax. Even though it's a superset of JavaScript, this still matters. The only special, type-related syntax Python supports is type annotations; defining and otherwise using types is done via Python's normal syntax, which is not very ergonomic or intuitive.

2. Microsoft invested a ton of engineering effort into TypeScript. The Python ecosystem simply has not received that large of an investment in static typing.

3. The Python programming language does not dictate how type checking should be done, so engineering effort is scattered across several projects like MyPy and Pyright. By contrast, TypeScript has essentially "won" that battle on the JavaScript side. (I know Flow exists, but I don't know anyone who uses it for new projects today—everything is done with TS.)


> I can't articulate specifically why but using typing in Python just feels like so much pain compared to other languages that have "opt-in" nominal typing syntax (PHP et al.).

Cognitive Load.


Using type checking in Python is always the wrong thing to do. You need to learn to use Microservices where you split up a large code base into separate 3k-line Microservice chunks of code that interact over well-documented APIs.

That's just how effective Python is done.


I don't think you're understanding their use case, they aren't building some Flask service, they're an ML drug discovery company supporting research scientists and as such they're more or less building a framework. There's a lot of code reuse and most large orgs building similar frameworks have adopted similar code quality standards - think PyTorch, Fairseq, Pytorch Lightning, Huggingface - which is why tools like MyPy / PyLint / Black exist. For people who have used those libraries there is a night-and-day difference between the anything-goes ones and the ones that do linting and unit testing.


In my personal experience codebases using MyPy tend to be considerably worse than codebases that don't.

By using a static type checker on a dynamically typed language, you have admitted you don't know what you are doing right off the bat. This means the software engineers in the project are very bad and therefore the code quality overall will also be bad.

Tools like MyPy exist to make Python appear to be more like Java to help Java developers, who aren't willing to learn how to code in a different programming paradigm.

That code will always be far worse than Python code written using the Python development paradigms.


> This means the software engineers in the project are very bad and therefore the code quality overall will also be bad.

Conversely, I do think you're very bad and don't have much experience with large python codebases.

For example the suggestion to split and use microservices makes no sense. Microservices are even harder to refactor.


What are some examples of popular python libraries that use this ideal python development paradigm?


Most of the Python libraries that are not locked to specific versions of Python, which is often the case with a lot of the badly written ML libraries.


Actually now I'm thinking you might be trolling.


You forgot to end with /s


I've seen Python projects using Monorepo MyPy and Python projects using Microservices.

Python Microservices style when it comes to code quality, maintainability, etc... wipe the floor with Monorepo MyPy style projects. It's not even close.


And I've seen the exact opposite. Messy, poorly/under/incorrectly-documented buggy microservices which barf on corner cases, painful to refactor, no way to verify correctness without tons of unit and integration tests. Conversely, I've seen huge heavily-typed mono repos which are a breeze to operate on, wrap my head around, jump to definitions, automatically refactor, and actually run with confidence.

So do our anecdotes cancel out?


No, because you can always replace bad microservices wholesale. It can't be "painful to refactor", because you literally just delete the code and rewrite it.

If you have introduced static typing, you then have to start doing refactoring and verifying correctness. "heavily-typed mono repos which are a breeze to operate on" I highly doubt such a thing exists, more likely you are used to a certain level of bad code and don't understand it can be better.


> I highly doubt such a thing exists, more likely you are used to a certain level of bad code and don't understand it can be better.

Ohhh trust me, I know bad code. And I know good code. My unit of measurement is how fried I feel at the end of the day. Dynamic loosey-goosey python? Brain fried, constant debugging, little confidence in deploys. Static types, pydantic, pycharm, mypy, DI? I'm in the zone all day.

I don't think you are the arbiter of all code, so I'm not sure what grounds you have to tell me what my taste in code is.

Additionally, you would be better served by writing comments with less presumption in them. It makes the discourse more adversarial than it needs to be.


That's horrific. The crazy thing is I think you might be serious and not trolling. It's like something out of a Dilbert comic.

Typing is good, anyplace you can get the computer to check more of your work - the better. There are practical limits and trade-offs as always, but some typing is better than none.

Microservices are usually a terrible idea. I know they're popular, but I've only had bad experiences with them. I strongly recommend against microservices in most situations.


I don't think one size fits all. There are plenty of successful python projects that don't match this. E.g., every python library (numpy as microservices doesn't make sense). Plenty of large successful Django apps exist. I am using typing effectively in a solo-dev Flask webapp that doesn't need to be split into microservices.

Sorry if your post was sarcastic and it went over my head.


I don't think Numpy is mostly written in Python. It's C.

So using type hints to annotate your flask api, etc. to generate docs is great. No problem.

The issue is that there have been a lot of Java developers switching to Python and they want to pretend that Python is a statically typed language (and use tools like MyPy) because it's more familiar to them.

Python is not a statically typed language and treating it as one leads to terrible results. Especially in large projects.

The issue of course is that the Java developers have no point of reference for what a successful large Python project looks like. So they think they are doing great with MyPy when, if you compare what they are outputting to a proper large Python project, it's pretty clear they are doing terribly.

When you use Python properly you write self-contained microservices. Typing information exists but only on the external interfaces, e.g. the API. You also don't share business logic code between the self-contained microservices because that massively decreases the maintainability of the overall system. You show your Python code is correct by using Unit Testing and Mocking.

Basically, there is a way to do large Python projects and MyPy (and other static type checking tools) have no place in that story. It only exists to support people in making bad decisions on their codebase.


Numpy has a large amount of Python code. C is definitely used in places, but the majority of numpy is Python. Not all operations need C implementations, especially as many can be built on top of other Python functions that do wrap C. Tensorflow/pytorch similarly have a large amount of Python code. I mostly work on developing a library that builds on tensorflow and is about 50K lines of code. Dividing that library into services makes little sense. One of the recent new typing features is mostly devoted to the numerical ecosystem in Python (variadic generics to do template-like types for matrices). I don't want to use a dynamically typed language, but Python is the clear leader in the ML ecosystem. A lot of my department does work that builds on ML research/libraries, and those are mostly found in Python.

If other languages had comparable ML ecosystems then a different language may have been chosen. But at the moment there's no competitor anywhere close. One lazy metric: I would estimate that 90%+ of research papers at ML conferences use Python.


"Dividing that library into services makes little sense." No, but dividing the library into smaller sub-libraries, that only interact with each other over well-documented interfaces, does make sense. It's called encapsulation.

If at some point you decide the way one of those sub-libraries works is wrong, could be done better, etc. then you can write a new sub-library that just provides the same public interface.

One of the reasons why Python is so successful is its usage of dynamic typing. Obviously, if you lose static typing something else needs to take its place: in Python's case, stronger encapsulation.


> If at some point you decide the way one of those sub-libraries works is wrong, could be done better, etc. then you can write a new sub-library that just provides the same public interface.

That's...that's literally how type systems and interfaces work. You do know Python supports behavioral subtyping (Protocol), right?

It sounds like you certainly have had some bad experiences with poorly-written typed python. But that speaks more to those maintainers not knowing how to actually use types effectively, vs a shortcoming of static typing in python. Python's type system has plenty of shortcomings, but has plenty of escape hatches as well.
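A minimal sketch of what I mean by Protocol-based swapping. `Storage`, `MemoryStorage`, and `remember` are hypothetical names, not from any real library:

```python
from typing import Protocol


class Storage(Protocol):
    """Hypothetical public interface: anything with these methods qualifies."""

    def save(self, key: str, value: str) -> None: ...

    def load(self, key: str) -> str: ...


class MemoryStorage:
    # No inheritance from Storage; matching method signatures are enough
    # for a type checker to accept it wherever Storage is expected.
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]


def remember(store: Storage, key: str, value: str) -> str:
    # Depends only on the interface, so the implementation can be replaced
    # wholesale without touching this function.
    store.save(key, value)
    return store.load(key)


greeting = remember(MemoryStorage(), "greeting", "hello")
print(greeting)
```

Rewrite `MemoryStorage` however you like; as long as the new class keeps the same public methods, the checker confirms the swap is safe.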


The size of the enclosed section is much bigger. A large section of code has a small well-defined interface.

What you get with typing and interfaces is that all the code has an interface. Small sections of code have a large badly-defined interface.

The main reason people are using typing in Python is to support large Monorepos. Because everything is typed in them, all the code in the repository depends on all the other code in a spaghetti dependency graph.

This results in the following issues:

* Small code changes lead to hour long unit test runs since most of the unit tests need to be rerun for every change.

* It's impossible to update the Python version since all the Python code needs to be updated at the same time.

* Long check-in times for changes running MyPy over 100k lines of code.

* Maintainability issues since code can't be updated without knock on effects all over the codebase.

Okay, so what are the benefits of using types in Python:

* On average MyPy type checking will catch 1 bug per developer per year, which wouldn't have been caught normally.

* It makes people who previously coded in statically typed languages feel more familiar with the code.

Basically, if you look at the Pros vs Cons, you should ditch the type checking. It's a net negative to the codebase.


> Small code changes lead to hour long unit test runs since most of the unit tests need to be rerun for every change.

You're writing unit tests wrong if they are taking that long. Unit tests should be quick.

> It's impossible to update the Python version since all the Python code needs to be updated at the same time.

I've done it, so, objectively not impossible.

> Long check-in times for changes running MyPy over 100k lines of code.

TFA addresses this. It can be greatly sped up with use of caching in CI.

> Maintainability issues since code can't be updated without knock on effects all over the codebase.

That's exactly why static types are better - automatic refactors.

> On average MyPy type checking will catch 1 bug per developer per year, which wouldn't have been caught normally.

I catch several per day, simply from IDE highlighting, in real time, and fix them immediately, which only works because of the type system. (maybe it's wrong to call these bugs at this point, it's more like proto-bugs which never even get checked in cause there is immediate feedback)

> It makes people who previously coded in statically typed languages feel more familiar with the code.

I came from a fully-dynamic-everything python world, and types were a breath of fresh air.

You've clearly been burned by (a) bad codebase(s), but I think you are drawing all the wrong conclusions. All the benefits of narrow interfaces, easy refactors, high maintainability, can be had with static typed python. I will admit it's much more challenging to be really competent at it than it ought to be, but this has been improving rapidly.


Encapsulation is a concept that is orthogonal to whether or not you use typing annotations in a language. You're conflating two completely different things and wrongly assuming that typed code is ipso facto poorly encapsulated. As most of your arguments stem from this premise, I find them very weak.


The enraged responses are pretty funny given that this is literally just what the BEAM VM is conceptually.

Trying to bolt a static type system on a dynamic object oriented language literally violates any benefit you get from using a dynamic language in the first place. (that is to say, change things as they run).

I have no idea why people try to enforce the programming paradigm of Java on Python. If you want a huge, static program... write it in a static language, don't write it in Python.


If I have a function that returns something in a list, why not declare it as such? The pain in Python typing comes more from bolting a largely (but not entirely) nominal type system on a structurally typed language. That can be addressed by Protocols.
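As a rough sketch of both points (the names `Named`, `User`, `Service`, and `names_of` are invented for illustration): the function declares that it returns a list, and the Protocol accepts any object with a `name` attribute, regardless of its class hierarchy.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class Named(Protocol):
    # Structural requirement: any object exposing a str attribute `name`.
    name: str


@dataclass
class User:
    name: str


@dataclass
class Service:
    name: str
    port: int


def names_of(items: Iterable[Named]) -> list[str]:
    # The return annotation simply states what the function already does.
    return [item.name for item in items]


# User and Service are unrelated classes, yet both satisfy Named.
names = names_of([User("ada"), Service("api", 8080)])
print(names)
```

The duck typing stays intact; the annotation just writes the duck's shape down.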


This is sarcasm, right?



