> some Linux distributions already switched to version 3
At the lightning pace of only 12 years since the release of Python 3 (and 15 months since Python 2's deprecation), some Linux distributions have ALREADY switched to it.
The developer story around python 2/3 is emblematic of why I don't consider it a suitable language for software development outside of very controlled cases like notebooks.
I dunno, they made some big breaking changes but continued supporting 2.x for over a decade after 3.x. There are definitely reasons I would not consider Python for some projects, but the 2/3 transition is not one of them. To be honest the event is so infamous (and IMO a little overblown) that they basically couldn't do this anymore, and I think it's been openly stated that they wouldn't.
What gets me about it is that the 2/3 transition was just a massive unsolved problem, and one which caused pain over a decade after 3.x was released. Other ecosystems managed to solve it - Rust, for example, handles editions in a very sound way - but with Python you had so many problems cropping up over the years because of Python versions.
There's this tiny, very-very vocal minority of people who continue to mystify the 2/3 changes and act as if migrating code were some massively difficult, herculean task that almost ended Python. Sometimes a little bit of "string/bytes split is ALL WRONG and utter nonsense!" mixed in.
The truth is that these were not actually big changes. The truth is that some orgs just wanna sit on their lazy asses doing jack shit to maintain their stuff and are surprised that, after 15 years, they have to do MAINTENANCE WORK??? ON THEIR CODE??? This is an outrage! The truth is that if porting was hard, the code most likely had no meaningful tests, so you couldn't safely make any changes anyway. The truth is that changes like these expose bad engineering. And guess what, people don't like being exposed.
It's not about the code changes, it's about having to worry about two distinct and incompatible versions of python existing at the same time. So if I type `python` into the terminal, I don't know which version of the language I'm getting without knowledge of the system.
This is an especially bad problem with an interpreted language, since changing the system version of python can break code at a distance. I can run a program today and it works, and tomorrow it won't because somebody changed the symlink to the other python version. That's why there's this huge mess of environment management and containerization.
I will give you an example of the problem. When ubuntu finally removed python 2.7 as the default, some of my workflows broke because scripts were depending on it. Scripts that I didn't write, or even know about.
This is only a problem because nobody bothered to solve it. I think one correct solution for example would have been to have both python versions live inside the same executable, with some kind of flag or something added to run against python 3. Just some standardized way of handling the different versions would have made a huge difference and saved a decade of pain.
PEP 394 was accepted in 2012. It says python2 should run Python 2 and python3 should run Python 3. Until 2019 it said python should run Python 2. And minor versions of Python 2 could be incompatible anyway. So the 2 to 3 transition didn't really change anything.
Not depending on the system version of anything is normal advice for other interpreted languages too.
Linux doesn't split shebang arguments. So you can't use env and pass a flag.
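For illustration, the workable form of that advice is to pin the major version in the interpreter *name*, per PEP 394, rather than in a flag; a small sketch (the `--use-3` flag below is purely hypothetical):

```python
#!/usr/bin/env python3
# Pin the major version in the interpreter name (per PEP 394) rather than a
# flag: a hypothetical "#!/usr/bin/env python --use-3" would reach env as the
# single argument "python --use-3" on Linux and fail to exec.
import sys

# Belt and braces: fail fast if the wrong interpreter runs this anyway.
assert sys.version_info[0] == 3, "this script requires Python 3"
print("running under", sys.version.split()[0])
```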
> PEP 394 was accepted in 2012. It says python2 should run Python 2 and python3 should run Python 3. Until 2019 it said python should run Python 2.
IIRC, there were a few linux distros that bucked this; at least one (Arch, I think, but it wasn't one I used) switched “python” to Python 3 quite early, and I think there were some others that did their own thing.
Well, I think some companies had a massive conversion burden, but these are typically massive companies, like Google. Most companies had nowhere near as much python2 code.
I think you hit the nail on the head with "lack of meaningful tests". The entirety of "locked to 2" python code I've dealt with was a serpentine mess with little/no tests, and so brittle that even going from print to logger risked things breaking.
I think most people rightly saw that the juice wasn't worth the squeeze for a 2/3 upgrade, and that the only reason we were going ahead with it was the sunk costs. If they wanted to create a new language that wasn't backwards compatible, why not make it a totally new and separate project?
This is analogous to saying that if your code was running on IE6 then why bother upgrading it to run on newer versions of Firefox or Chrome.
The only difference between the Python 3 migration and other projects' major version changes (PHP 5 to 7, .NET 4 to newer versions, etc.) is that migrating to Python 3 had the equivalent of a couple of major versions' worth of changes all rolled into one. Yes, it's painful; major version migrations are not easy.
But it's really no different from not upgrading JS code to keep up with the latest browser security issues, or not upgrading your JVM (there are _tons_ of Java codebases stuck on ancient JVM versions).
If you had a problem with migrating to Python 3 after a decade you would've still had the same problem with any other major software upgrade in your infra.
But a lot of python code is just scripts. Not everyone is going to go back and migrate every random script they wrote which just renames some files or something just because of "security issues" or something like that.
Also versioning is not an issue, the problem is that Python didn't provide any viable versioning strategy. You just had two separate executables now, and you, as the user, have to decide where to put them in your $PATH and how to make sure the right code gets executed with the right version. That should have been baked into the language or the tooling itself.
> But a lot of python code is just scripts. Not everyone is going to go back and migrate every random script they wrote which just renames some files or something just because of "security issues" or something like that.
That's not really a problem because the Python maintainers conveniently provided a package to do trivial 2-to-3 migrations.
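For context, that tooling was presumably the stdlib's 2to3 fixer script (plus third-party helpers like six or futurize); a rough sketch of the kind of mechanical rewrite it does:

```python
# Roughly the kind of mechanical rewrite the 2to3 fixers perform.
# Python 2 original (kept as comments so this file runs on Python 3):
#   print "hello", name
#   if d.has_key("x"): ...
#   for k, v in d.iteritems(): ...
name = "world"
d = {"x": 1}

# What the fixers emit:
print("hello", name)
if "x" in d:
    pass
for k, v in d.items():
    pass
```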
> Also versioning is not an issue, the problem is that Python didn't provide any viable versioning strategy.
There were explicit backporting libraries created to manage the transition. Django depended on these for many years to successfully support both Python 2 and Python 3 packages.
> You just had two separate executables now, and you, as the user, have to decide where to put them in your $PATH and how to make sure the right code gets executed with the right version. That should have been baked into the language or the tooling itself.
If you're executing a script, Python supports shebang notation to determine your executable.
Sorry but none of these are real problems. The only place where you had migration issues was with very large packages that did complex string manipulation stuff, and that's the sort of code that requires maintenance in any language anyway. It's not like there isn't an enormous amount of precedent for these kinds of migrations (Ruby minor versions break. PHP's 4->5->7 migrations were huge).
> That's not really a problem because the Python maintainers conveniently provided a package to do trivial 2-to-3 migrations.
That's not the point. For example, when ubuntu dropped python 2.x as the default, some of my workflows got broken because of scripts which I didn't write and didn't even know about. There's plenty of old code which people still depend on. System upgrades should not break your projects, or require you to go in and do surgery on scripts you didn't write.
> There were explicit backporting libraries created to manage the transition. Django depended on these for many years to successfully support both Python 2 and Python 3 packages.
That's a workaround. A solution would have allowed python 2 & 3 to coexist with no additional effort.
> If you're executing a script, Python supports shebang notation to determine your executable.
Again, you're depending on the people who wrote the python code you depend on to handle this in the correct way. It's not something which is built into the ecosystem.
I'm sorry, but all your arguments seem to boil down to the fact that there are ways to make python usable despite the extreme fragility of the toolset.
> For example, when ubuntu dropped python 2.x as the default, some of my workflows got broken because of scripts which I didn't write and didn't even know about.
The only solution to that is never breaking backward compatibility, because either someone might prematurely upgrade (not what happened with Ubuntu) or someone might not maintain software (what seems to have happened with the third-party tools in question, whether it was the original maintainer or some packager or...) and might also not vendor dependencies, on the assumption that external environments will never change.
> A solution would have allowed python 2 & 3 to coexist with no additional effort.
PEP 394 allows that: if you depend on a particular Python major version, use “python2” and “python3” to refer to it. Both side-by-side installation and continuity of operation over the time “python” switches from 2 to 3 are provided.
While some linux distros broke the recommendation on when to switch “python” targets, that would be transparent to anyone following the recommendation.
> There's plenty of old code which people still depend on. System upgrades should not break your projects, or require you to go in and do surgery on scripts you didn't write.
In Linux land, projects that depend on system libraries and `*-dev` packages break constantly across major versions.
> That's a workaround. A solution would have allowed python 2 & 3 to coexist with no additional effort.
You can do that if you make your python interpreter version explicit in your shebang.
> I'm sorry, but all your arguments seem to boil down to the fact that there are ways to make python usable despite the extreme fragility of the toolset.
A toolset that gave you a decade to upgrade with ample warnings is the opposite of fragile.
> MAINTENANCE WORK??? ON THEIR CODE??? This is an outrage!
I usually keep a CI branch with dependencies unpinned (I made pip-chill for this use case) precisely to make sure I'll be the first to know when something coming from the future breaks my code.
Being lazy is a virtue, but not if it causes an engineer to avoid work that needs to be done.
The problem was in part because they actively broke the 2/3 compat story.
I.e. someone who was into Unicode on 2 with u'' got hosed on 3, despite 3 going on and on about the importance of string handling. How hard would it have been to support u so someone could support both versions more easily? It was insanity.
> How hard would it have been to support u so someone could support both versions more easily?
Exceptionally hard. Py2 str objects are tantamount to py3 bytes but with the py3 str apis. What this amounts to is every "str" in py2 is an untagged union of str/bytes. Python is exceptionally dynamic. This means if you allow different behavior in different modules, you risk silent, customer-data-corrupting bugs, among other headaches, as things continue to "work" but do the wrong thing.
At least with a hard 2/3 switch, you are on your toes and know there is a (mostly) finite transition period.
from __future__ import str basically guarantees you'll have "3-compatible code" causing headaches years into the future.
That's not even to touch on the engineering difficulty of interop between encoding-oblivious strings and unicode ones.
Python 2 let you put a u in front of a string literal to make it unicode. Python 3 made it a syntax error at first. Python 3.3 restored it so people could write code compatible with 2 and 3 without importing unicode_literals.
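For concreteness, a small snippet that parses and behaves the same on Python 2.7 and on 3.3+ thanks to that restored prefix (PEP 414):

```python
# -*- coding: utf-8 -*-
# Parses and behaves the same on Python 2.7 and on 3.3+ (PEP 414 restored the
# u prefix), without needing `from __future__ import unicode_literals`.
greeting = u"héllo"     # text type in both versions
payload = b"\x00\x01"   # bytes type in both versions

# Keep text and bytes apart explicitly at the boundary.
encoded = greeting.encode("utf-8")
assert encoded.decode("utf-8") == greeting
```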
Is this much worse than downloading some installer and running it? Those can be just as compromised. So can packages in package managers for that matter.
Piping straight to bash can be especially bad if you've cached sudo credentials for the current session - some of these scripts call sudo "inside".
Otoh - the connection is signed (it's https) - unfortunately it's often quite easy to compromise a web site. Obviously, listing gpg signatures on the same page doesn't add much unless it's possible to verify the gpg key some other way.
Edit: another problem is that you really should check exactly what's in your clipboard before pasting into a terminal.
The safety in your steps is reading the script, not in avoiding curl | bash. An installer being signed doesn't guarantee it's not malicious; if someone has overtaken a host and replaced the binaries, they'll just sign them themselves. Unless you're manually checking that the signature matches your expected source, running a signed binary doesn't save you.
True, it does not. I don't recommend downloading (random) binary installers and running them either.
With e.g. Linux ISOs, you typically already trust the signing key for your OS updates.
But unless you are vigilant about your ssl root certs, you'll easily allow a lot of malicious and incompetent services to potentially intercept most of your ssl traffic... (due to there being many trusted roots by default).
> if someone has overtaken a host and replaced the binaries
This again depends on who signs the binaries and how, and how the signatures are trusted. Typical Windows (and Mac?) setups will gobble up any signature. But if you do check who signs the binaries - then the signing key will easily be the most secure part of the system: a compromised ftp/web site allows hosting malicious binaries, but typically doesn't grant access to the signing key.
With letsencrypt a hacked web site will typically have access to a valid ssl cert - no need to further compromise mx/mail records or gain access to a business phone number etc.
An ASCII-armored, signed shell script can be distributed safely via a pastebin. Unfortunately there's no good automatic/standard way to do so. Or rather, no standard tool to prompt to trust the signing key - and then run the script - beyond basic gpg --search-key --key-server.. + gpgv.
Maybe signed git repos would be easiest - but I don't know how easy it is to limit which keys are trusted - if it's possible at all?
The helm project does a little dance to try and verify downloads - but for all the effort it pretty much amounts to trusting the script, not the keys/signatures.
I was hopeful sequoia might help - but apparently its sqv tool is even worse than gpgv - neither can handle an ascii armored public key, and sqv can only handle detached signatures.
> The safety in your steps is reading the script, not in avoiding curl | bash
Well, yes. The safety is in doing something between "acquire potentially malicious payload" and "running payload". I don't see how "safety [is] not in avoiding curl | bash" when, avoiding the direct pipe to bash is exactly what I suggest.
If you look at the url, then curl and pipe that url, you have no idea if bash sees what you just reviewed.
But you have exactly the same problem with downloading a binary, or running pip install. You have no idea what that code does, so curl | bash doesn't hurt any more than any other normal method of installation.
Do you read the source of every setup.py you run before running pip install? Also, if you are untrusting of the source enough to verify their install script is safe, why would you install their template to run on your machine without verifying all of that too? Finally, a 10-line bash script might (as this example does) just call out to another curl | bash, or to a pip install/npm install.
> Do you read the source of every setup.py you run before running pip install?
I generally run make, setup.py, cargo build etc in the context of cloning a source repository. I certainly could do a better job of sanity-checking those things, but I do try. And I definitely try to avoid having sudo credentials cached when I do - to foil "sudo cp artifact /usr/sbin" and other awful things people do, because they found it convenient.
> Also, if you are untrusting of the source enough to verify their install script is safe, why would you install their template to run on your machine without verifying all of that too?
I generally trust people more to write "left pad" than install scripts. Many sysadmins are good programmers, few programmers are even remotely decent sysadmins in my experience.
> Finally, a 10-line bash script might (as this example does) just call out to another curl | bash
In which case one has to chase down the rabbit, or give up.
Sometimes one will discover that the end game was downloading a gpg signed tar archive with the release artifacts - and one can go and do that.
> or to a pip install/npm install.
People do do awful stuff in makefiles and package install scripts, but for vanilla Python/JavaScript the laziness of programmers tends to work to our advantage - there's usually little extra madness/magic in there.
Sure, running pip install -r requirements.txt can do almost anything - but it's unlikely to run your package manager under sudo and mess up your system packages, or add something questionable to your package sources.
> Is this much worse than downloading some installer and running it?
Yes.
You should inspect what you download.
Also, you should probably use the Python interpreters provided by your Linux distro, that stay in directories you usually can't write to and come in signed packages. On a Mac, the next best thing would be MacPorts.
Nothing here overrides the system-wide Python version.
The article specifically goes into why not to do it.
> pyenv allows us to set up any version of Python as a global Python interpreter but we are not going to do that. There can be some scripts or other programs that rely on the default interpreter, so we don’t want to mess that up.
Using the Python interpreters in your system doesn't mean you can't make virtualenvs out of them - it's just that they are precompiled and well supported on your specific OS.
I used custom Python interpreters a lot and it's nice to be able to rely on the system to provide a sensible environment instead of forcing myself to build my own.
That's why the traditional Windows way of downloading a setup.exe and running it with admin privileges is a bit scary for people coming from other platforms. Installing an .msi is less bad, or so we are taught.
I think it'll compile various Pythons on your machine under your user. I'd prefer to install multiple versions with Homebrew (learned this today; not sure how possible it is), as `brew install python@3.6 python@3.7 python@3.9` (because Big Sur has 3.8 built-in).
In reality, I'm a more traditional Unix person and prefer MacPorts, where you can do `sudo port install python36 python37 python39` in a very BSD way of doing things.
Homebrew has broken my computer one time too many.
The script is served over https, so it's not going to be tampered with (unless you have a malicious cert, but at that point you can't trust anyone), and curl | bash isn't any worse than downloading a script and just running it, or running a precompiled binary you don't trust.
pyenv could get taken over and you won't know. It's also possible to detect when someone is piping to bash (on the server) and serve a different payload [0]. You're better off piping curl to a file, reviewing the file and then running it manually.
Yes, there absolutely should be. It would be a massive improvement if that happened.
It requires a few extra steps to be actually secure. You actually need to verify the hash from a trusted source for it to be actually secure. If the delivery has been tampered with, you need to ensure that the delivery of the hash has also not been tampered with. In practice, codesigning is the solution, but certs are expensive, and impractical for a small project.
How about the hash being something that you calculate locally?
1. (local) Download the file from the URL.
2. (local) Review it locally, in a text editor.
3. (local) Get its hash locally, from the file in your file system.
4. (SSH) Feed this hash into the fictional tool above.
5. (SSH) If what curl gets is the same as the file that you've reviewed, it gets piped further into bash, otherwise the execution stops and an error is output.
Of course, that's only applicable to this particular case, where a compromised server could detect that a bash pipe is used and return different file contents. That would only be useful in situations where you want to review it on a local device, such as a desktop and run it on a remote one, such as a server.
Edit: If you want to review it remotely, there's nothing to prevent you from using less or something to view it before manually running it with Bash. That just requires the discipline to not use one-liners that both download and run it, as long as no such tool like the above exists.
I can't believe I'm going to suggest a blockchain, but I think what you really want is:
- run `cu-sh example.com/questionable.`
- this uses `$editor` to let you review the contents (skippable with a command line flag)
- generate a hash of your local contents
- check said hash against a blockchain to see if everyone else who got it got the same contents as you.
- decide from 1 and 2 above whether you actually want to proceed with the install.
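A minimal local sketch of that flow in Python, minus the consensus check; `cu_sh` and everything in it is made up for illustration:

```python
#!/usr/bin/env python3
"""Minimal local sketch of the flow above (download, review in $EDITOR,
hash, confirm, run) without the consensus check; all names here are
hypothetical."""
import hashlib
import os
import subprocess
import sys
import tempfile
import urllib.request


def cu_sh(url: str) -> None:
    # 1. Download to a file instead of piping straight into bash.
    body = urllib.request.urlopen(url).read()
    with tempfile.NamedTemporaryFile("wb", suffix=".sh", delete=False) as fh:
        fh.write(body)
        path = fh.name

    # 2. Let the user review exactly what was fetched.
    subprocess.run([os.environ.get("EDITOR", "vi"), path], check=True)

    # 3. Hash the reviewed file so the digest can be compared elsewhere.
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    print("sha256:", digest)

    # 4. Only execute after an explicit confirmation.
    if input("run this script? [y/N] ").strip().lower() == "y":
        subprocess.run(["bash", path], check=True)


if __name__ == "__main__":
    cu_sh(sys.argv[1])
```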
You could replace blockchain with checking if it's signed, and the key matches an owner on keybase/github/some other federated identity provider too.
You often want to do this anyway, because the installer often supports various options and env vars. If you download the file you can read its --help output, and even keep it on hand in case something bad happens, or just for your own records.
It's also possible for you to copy things you can't see from web pages. So the command(s) you end up with may not be what you thought. So there's a trust issue with the site you get instructions from as well.
If you're afraid the host may be untrusted then you would be wrong to download any of their code at all.
The safety is in reviewing the code there, not in avoiding curl | bash. Running pip install or npm install is just as dangerous.
> They should offer a download with signature validation instead. Signed by Apple, Microsoft, etc if possible.
If the host is compromised, the attacker will just get Microsoft to sign their malware instead; see [0]. If the host is compromised, and you run the code without reviewing it, you're hosed regardless.
With Python, the only sane choice is probably through Docker images :-(
This is the reason why I consciously attempt to move away from Python and choose Go or Rust for new projects if possible. Of course, on existing projects, Python deployment is a pain.
> the only sane choice is probably through Docker images
It's really not. You either tailor your dependencies to match the host (clue: you should be doing this anyway, it makes things much easier in the long run), or use virtual envs.
pyenv also gives you some (non obvious) flexibility as well
However, having said that, the chances are your Go or Rust binary is going to be in a Docker env as well (so is your Python), so it's basically all the same.
I'm personally a big fan of (especially) the most recent builds of Python 3, but lately I've been interested in learning Go or Rust just because there's a few things I really like about them both vs some other languages I've looked at.
So, seein' that you've coded in both Go and Rust apparently, my question to you is: Which do you prefer of the two (and why)? I personally lean a bit toward Go, but I haven't learned enough of either to decide absolutely which of the two I should learn first.
I think it is better not to _decide_ which one to learn. They both have their perks and quirks. If you can, I'd recommend dipping your toes in both. For network services, Go feels super nice because of the high-quality libraries in its ecosystem. I feel like it is a better Java.
Rust, on the other hand, feels like a better C++... with a steep learning curve, and useful when you can/need to get something correct/efficient/safe at the cost of increased developer time and cognitive load.
Nice. Thank you for the info. Do they both have pretty good GUI toolkit support available? I currently use PyQt5 in Python, but I'm not totally against GTK if that's what is easier for Go or Rust. Still think I kinda feel like Go's the one of the two I'm prolly gonna learn first, although you've got me leanin' hard toward also learning Rust in addition to Go.
Desktop/local usage, especially for non technical users or people working in environments where they can’t install docker (eg locked down corporate machine). There isn’t a great way to pass around a single executable or installer for a Python app. There’s pyinstaller, but it’s finicky and doesn’t quite work cross platform (you have to build the executable on the target OS).
On the other hand, OP’s setup makes it very easy to publish packages. So you can create a tarball and have a user pip install that, then run your app with a simple CLI. Or publish to pypi if you want it public. The downside is you assume the user has the right version of Python and knows how to switch versions if need be and all the weirdness that comes with that. But for web apps, packaging your app also makes it easy to wrap in a simple dockerfile that basically just installs the package and then runs it.
Yes, but if you sign the executable with an acceptable certificate (the company’s cert or some other trusted CA), you can install an arbitrary executable usually. Docker would typically be blacklisted, however. In most corporate IT environments, it’s much easier to get a small, focused executable white listed or code signed than to get something with such a huge attack area as docker approved for regular, non-technical users.
Personally I have not. I have only had one particularly weird use case where I really needed to deliver a Python executable alongside an Electron app, and made it work with Pyinstaller. We’re actually reworking things to use Pyodide to compile the Python to WASM so we don’t have to deliver any separate Python code at all. It’s a real Rube Goldberg machine of an application, but we had a lot of weird constraints we had to meet (eg, we couldn’t use sockets to communicate between Python and Node, so had to package the Python as an executable). It works beautifully but it was a mess to get running.
Docker was made for development and testing of code, not production. It has been co-opted for production with good success. It seems fine to do it for hobbyist projects, but still feels dirty in a commercial setting.
The largest internet services on the planet were using LXC containers for production ops before you ever heard of the term 'docker'. In its earliest iterations it was easier to push a container to production than to run it on your laptop.
From what I recall, spotify were the first company with a large footprint to use docker. However for some reason they skipped VMs and went screaming into docker when it was _very_ new. Personally that seemed like a mistake, but you know, each to their own.
If you're wanting to get into a bun fight about containers, then IBM 360 and JCL has some time for you.
I'm using docker in commercial environments for years and cannot complain... But I don't distribute software, which is where the problem is probably more visible.
If you're deploying a REST API with authentication, caching and so on then the KISS solution is python with a framework (such as Django) in a docker container.
I've spent many many hours (years) trying nix, rust, Haskell, go, spring framework and all sorts of other things which are a lot of fun but not so good for getting shit done.
For other domains this doesn't apply of course; lower-level network stuff, portable CLIs, CPU-intensive workloads and so on are much better in Go/Rust, but you can integrate them via a network call, by spawning an OS process, or through an FFI in the case of Rust.
I'm using Pycharm Pro and it can use Docker for code completion. I use Docker AND venv on host filesystem for friends using Pycharm Community and VSCode. Is this still necessary?
Heroku, yes, I would push for that. It's expensive, but it'll save you an entire devops function right up til you hit the 30-people mark.
Elastic beanstalk is just a horrid dev environment. Lots of waiting, lots of non-obvious options, and very little reward. I would personally push for lambda and zappa (https://github.com/zappa/Zappa) for python, as it seems to be much easier to deploy and debug.
If this is a company project, a REST API or task service of some kind, then you are probably using Docker in the year 2021.
If this is anything else, or if you work for a shop that hasn't embraced containerization, then you use PyInstaller (http://www.pyinstaller.org) to bundle your application. Either into a directory that contains your full Python virtual environment (only 5-10 megs!), or into a single executable file.
The latter is most convenient for a Go/Rust-type experience. But the former will start up faster, because that single-file executable has to first uncompress itself to the system temp directory.
If your Python code has native code under the hood (eg. Numpy) this two-step is very much a 'step 1: draw a circle, step 2: draw the rest of the owl' description of production deployment.
If you're deploying to a server you'll want to pip install dependencies on the server. If you're using docker you'll want to pip install dependencies in your container. If you're deploying to an end users computer you'll want to use pyinstaller, which admittedly is not trivial to get working in all cases.
This is exactly my point. .venv/ needs to live in the project deployment directory, and needs a wrapper script to use the python binary/symlink therein.
Isolate the project from the python runtime it uses, and you'll always have the right set of packages installed.
None of this is perfect, but an in-tree .venv/ and convenience scripts seems to be the least-worst option.
I meant to say "I look to see how they are going to deploy and use the virtualenv they have created."
Hint: what works in development -- just telling people to type "source .venv/bin/activate" or such -- doesn't fly in an unattended environment.
All that it requires, of course, is a bin/venv-python wrapper (bash) script to reference the created .venv/ directory, so this is hardly ground-breaking stuff, but as I mentioned originally, this (crucial) section is missed every time.
I use a minimal-but-complete pairing of venv and pip, and a couple of location-independent wrapper scripts, and I can run things the same across all environments.
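If it helps, the wrapper described above is only a few lines; here is the same idea sketched in Python rather than bash, assuming an in-tree .venv/ next to a bin/ directory:

```python
#!/usr/bin/env python3
"""Sketch of the bin/venv-python idea in Python form (the parent uses a small
bash script for the same thing): resolve the project's in-tree .venv/ and exec
its interpreter, so nothing ever needs to be "activated"."""
import os
import sys

PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
VENV_PYTHON = os.path.join(PROJECT_ROOT, ".venv", "bin", "python")

if not os.path.exists(VENV_PYTHON):
    sys.exit("no .venv/ found; create one with: python3 -m venv .venv")

# Replace this process with the venv interpreter, forwarding all arguments,
# e.g.:  bin/venv-python -m myapp --port 8000
os.execv(VENV_PYTHON, [VENV_PYTHON] + sys.argv[1:])
```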
These are pretty solid recommendations overall (pytest, poetry, pyenv). I'd guess the most controversial things would be pre-commit hooks (please no) and using a force-fed autoformatter (DEAR GOD NO).
pre-commit is actually a very solid linter task runner framework, I would still recommend it even if you don’t like Git pre-commit hooks (just omit the `pre-commit install` part); the `pre-commit run` command is still very useful invoked directly (optionally with --all-files and passing a specific hook id to run).
This is how the internet works now. If you ask a question it's rare to get a good answer. If you post a faulty opinion as advice, you're more likely to get a good answer.
These are fairly close to our working environment (we're using pylint instead of Flake8, haven't switched most of our projects to poetry yet, and we use Pydantic but not MyPy yet).
I think you mean Flake8 rather than Sense8? Anyway, gives me a chance to plug Flake8 Alphabetize, a Flake8 plugin for import ordering https://github.com/tlocke/flake8-alphabetize
Flake8 Alphabetize will just give you warnings about import order, whereas isort will actually change the code itself.
So as I see it there are two types of tool: formatters and checkers. A formatter doesn't alter the Abstract Syntax Tree (AST). In other words it doesn't alter the meaning of the code, just how it looks (e.g. the formatter Black). A checker on the other hand looks only at the AST and just gives warnings in your editor, and then it's up to you to change it or not.
The import order is encoded in the AST, so changing it falls outside the scope of Black (which never changes the AST). So I felt there was a need for a tool that works as a checker, just giving warnings in your editor if your imports don't conform to PEP 8, and hence Flake8 Alphabetize.
It's worth mentioning that Flake8 Alphabetize follows Black's philosophy of having only one way of doing things, so a project can standardise on Flake8 Alphabetize and everyone's imports will look the same.
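For reference, the ordering such a checker expects is PEP 8's grouping - standard library, third party, then local - alphabetized within each group, which is the convention tools in this space converge on. The `requests` and `myproject` names below are placeholders:

```python
# PEP 8 import grouping: stdlib, then third party, then local, with a blank
# line between groups; alphabetical order within each group is the convention
# checkers/formatters in this space enforce. `requests` and `myproject` are
# placeholder names for illustration.
import os
import sys

import requests

from myproject import settings
```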
The checker/formatter distinction makes a lot of sense. Given the choice between the two for a given goal or set of operations, e.g. alphabetize/isort, I'm not sure why I'd settle for a checker. If the fix is deterministic and objective, why would I want to make the fix myself, rather than let the tool handle it?
My feeling is that I want anything that could change the meaning of the code to be down to me making a manual edit. So I'm quite happy for the Black formatter to automatically rewrite my code as it never changes its meaning. For anything else I like to be in the driving seat.
Anyway, that's just my feeling at the moment, I'm sure others have a different take.
The lack of discussion of lock files is a big omission in any 'best practices' page. "Lock" only appears in a log output of Poetry.
Even if the answer is "Poetry handles it" you certainly want to explain why they're important just like is being done in the rest of the "Why use..." sections.
Huge +1 for this; pip-compile is so much faster and less surprising than poetry. Poetry's #1 footgun is that it updates the version of unrelated dependencies when you add a new dependency. No thanks.
Conda is one of the most annoying things with Python imo. Or, it's just another symptom of the crazy dependency hell of python, but it's actively making it worse. Now every project is a mashup of conda and pip and probably more.
Just right now I'm trying to fire up a new instance on GCP. With a completely clean image, doing a conda install hangs for 30 minutes while it's trying to "solve" something.
Quite confused about this comment as many of the latest tensorflow V2 releases aren't reliably uploaded to conda forge. IIRC there's only like 4 or so of the V2 releases uploaded.
You can conda install the CUDA dependencies and then install the required Tensorflow version via pip inside the conda env. But that's not much different to installing CUDA manually and then installing tf from system pip.
It's much faster and easier to pull a tf Docker image as it's their "officially supported" way to get up and running.
So... as a tf user and a sysadmin... Nah. No conda for me thanks.
One project uses tf 2.1, one uses 1.15 and the other 2.4 ... for me it is much more convenient to have 3 envs rather than 3 containers or switching the system cuda as needed...
I especially had problems debugging through Docker containers back in the day, so I never picked it up again.
Conda has been a lifesaver for me in the past, but it got so slow in ~2019 (minutes+ to resolve dependencies) that I've switched back to pip whenever possible. Maybe things have been resolved now though?
> Conda has been a lifesaver for me in the past, but it got so slow in ~2019
This is why mamba [0] was created. It is a C++ reimplementation of conda for much better performance. mamba is a drop-in replacement of conda and can operate on the same anaconda, condaforge (and mambaforge) repositories.
I do have to try mamba sometime but I feel like there is something more than python slowness going on.
I use Gentoo and its package manager is written in python. Even though it is more complex (IMO) it doesn’t have nearly the same slowness when it comes to dependency resolution and conflict detection.
Yeah, now all the "hip" devs are driving things towards "poetry".
It's very disillusioning to see how sheer twitter-followings and "popularity" type metrics drive development these days by forcing alternatives to be de-facto neglected. Everyone does what's "hot", so all the tutorials and bug reports and tests and SO questions and new libraries and and and all go towards that framework or language or tool or method. You can't even argue technical merits towards the neglected options because yes the popular tool is better, but only because we have a metric boat load (millions) of man-hours being pumped into making it better instead of all the alternatives. It's like the tech-equivalent of fashion fads in that it's self-reinforcing. Not to take away from some of the actual and technical achievements that some of these things have made, of course.
Great recommendations all around. It'd be nice if they went into how they handle dealing with C extensions, as that adds a large amount of complexity to a Python project, and most of the published advice on the topic is quite old.
EDIT: Mentioning `pyproject.toml` and the relevant PEPs would also be great (i.e. PEP 621, PEP 517, etc.). Fortunately Poetry is compliant with these PEPs.
I really struggle with this part of Python package development. I have a rather complex C++ library I have been calling with Cython, but I am struggling to find the best practices for either compiling the library with setuptools or calling cmake and copying the library into the package, etc.
That's a great resource in general for Python extensions, but doesn't have much to say about packaging and distributing them. For that the best resources I've found have been to look at what large/complex projects that use them do, as they've often had to deal with many of the odd cases that often come up.
Using Cython adds another layer of complexity to the packaging/distribution that that link does not address. Fortunately now that you can specify build requirements in `pyproject.toml` Cython has become significantly easier to use on that front, but there are still some less than obvious bits to say the least.
Maybe I should publish an overview of Python best practices for publishing C extensions (with or without Cython).
I prefer containers over pyenv and poetry. This way not only python version and dependencies are "in one place" but also all other stuff that comes along with a new project. The OS, the database etc.
The one thing I dislike about Python projects is that Python plasters the compile cache files all over the place. Is there a reason to change that? Currently I use the -B flag for all my scripts. But that makes it slow. I wish Python would have an option to perform like PHP and keep cached compilations in memory instead of on disk. Or at least somewhere in /tmp/.
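(For the /tmp/ half of that wish: Python 3.8+ can at least redirect the bytecode cache to a single directory via PYTHONPYCACHEPREFIX or -X pycache_prefix; a quick check, assuming 3.8+:)

```python
# Quick check of the bytecode-cache redirection available since Python 3.8.
# Run as, e.g.:  PYTHONPYCACHEPREFIX=/tmp/pycache python3 show_cache.py
import sys

# None means the default behaviour (__pycache__/ next to the sources);
# otherwise all .pyc files land under this single prefix.
print("pycache prefix:", sys.pycache_prefix)
```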
Pip is fine, it depends on your goals. I've found requirements.txt less enjoyable to maintain for several reasons – you need to separate dev, test dependencies on your own time, there's no notion of a lockfile for transitive dependencies (`pip freeze` notably doesn't separate actual dependencies from transitive dependencies). pip is also darn slow at installing dependencies once you hit a certain scale, and poetry outperforms it pretty substantially.
Poetry does what I expect a package manager to do, and does it well, especially when working with a team of developers on an application versus individually. There's not a compelling reason for me to use pip directly as a less functional alternative.
Bit rot, "it works on my machine"-style issues, cache misses on dependency installation (which can really bloat deploy times in deploy pipelines by busting Docker caches across machines, too). Can be a security issue if a vulnerable library version is pushed and one installs it as a consequence of having non-locked dependencies, especially in python where package install scripts have a lot of power.
Lock files help solve for these. You can build software without solving them, but it makes my life easier.
All of this. Plus picking up a legacy project from someone with a giant requirements file and then trying to pick through and work out what we actually want locked and what's been installed by something deep in a dependency tree is a nightmare. Even if you don't use poetry for your own sake, use it for everyone else's.
Good question! From a template repo commit at work[1]:
Advantages:
- Separates development and production dependencies.
- The dependency version is specified separately from the lock file. In practice this means that the version in pyproject.toml generally only needs to be set to anything other than asterisk if and when it becomes necessary to use a specific version range.
- The lock file includes SHA-256 checksums by default, and these are checked during installation.

Disadvantages:
- More complex configuration than Pip.
- Python package managers come and go, and this one is likely going to suffer the same fate eventually.
- Introduces poetry.toml simply to specify that the virtualenv should be in the project directory. The default is to put virtualenvs in ~/.poetry, which is a non-standard location and therefore might interfere with typical IDE setups, mounting the virtualenv in containers or VMs, and the like.
> The dependency version is specified separately from the lock file.
That. The simple fact that a pip requirements file mixes both the packages you want and the dependencies pulled in by those packages is a valid reason to switch to Poetry, IMO.
Yeah, I don't think I've created a single venv in the last 2 years. I don't need them for basic stuff (e.g. a quick script) and for anything more I'd rather have a container so I can deal with all dependencies in one place and use it elsewhere quicker if I need to.
The link describes the attack vector. pipenv locks the dependencies using hashes. If your company has my-company-py-lib, then pip could install a public library that pretends to be the internal one.
A Docker container starts in two seconds or so. And gives me everything I need. So no need to dabble with a VM.
There is not much overhead in running a project in a container. The project has a setup file that turns a fresh Debian 10 into whatever environment it needs. And that's it. Run that setup script in your Dockerfile to create a container and you are all set. Want to run the project in a VM or on bare metal? Just install Debian 10, run the setup script and you're all set.
> Also at what point do people just realize that all of this overhead is a gigantic waste of time and just use a better language?
Probably some time shortly after your developer time costs less than your cloud compute time. Until you hit that point (if ever) there are few options as cost-effective as Python.
In any language ever if you use non-vendored shared libs you will hit this problem. Certainly not specific to Python, in fact the reason package managers on *nix are necessary (and not just a nice to have) is because of this.
Yeah to be honest if you need containers to make a project reproducible this is just a sign of failure. You're basically saying you need to encapsulate the entire system for your code to run correctly.
Loads of tools have external dependencies that are hard dependencies. And library search paths on my machine vary from those on prod... I could go on, but I'm not sure I understand why using a container to manage all that is a failure?
There are plenty of reasons why you might want to containerize a project. If you have a lot of system dependencies, for example, you might want to consider including a Dockerfile in your project to make it portable.
However what makes Python a failure is that people feel they need this to dependably run a python program which only has pure-python dependencies.
Compare this to a language like Rust, or the NPM ecosystem. In those cases, the tools have managed to dependably encapsulate projects such that you only need the package manager to make a project fully repeatable.
With either of those ecosystems, there's basically one system dependency, and you can find any repository online and dependably do `git clone ...` then `cargo build` etc. to make it work. With Python, you effectively have to reproduce the original developer's system, and that is a failure.
Huh? Either something is really weird about your env or we have different ideas about what counts as a pure Python package.
Because if you don’t rely on Python packages with extensions that farm out to external libs it’s as easy as git clone, pyenv virtualenv, pip install -r, and python -m build.
The thing that makes this worse than other ecosystems is:
1. virtualenv shouldn't be necessary. This is more or less the same concept as containerization. This is only needed because python has a fractured ecosystem, and setting up your environment for one project can break another.
2. you also have to know which environment encapsulation and package management solution the library author is using - this is not standardized
1. Virtualenv is essentially the same as node_modules, yet everyone rants and raves and loves that. And the kind of breakage you're talking about is astronomically rare in my experience.
2. No you don't - what makes you say that?
virtualenv is so much less user friendly than npm. Like, why do I have to run a `source` command to make virtualenv work? I don't use either often, but I can remember how to use npm if I haven't used it in like 6 months, whereas I have to look up the right virtualenv commands if I haven't used it for like 2 weeks.
In the article the author mentions that VS Code doesn't support Poetry venvs. This can be solved by configuring Poetry to create the virtual environment directly in the project folder (which in my opinion is neater).
One thing I'd add here is Pydantic. It complements the type hinting very nicely and lets you get more static with your classes. But that's something you don't just install; you need to use it. Which is another omission, I suppose; Python's type hints are opt-in, so mypy won't do you any good unless you actually add type hints, right?
There are a bunch of mypy configuration options to disallow certain things like --disallow-untyped-defs. There’s also --strict which forces typing on everything.
Man, developing python with pydantic and "mypy --strict" (I follow pydantic's config [0] where I can) is such Type 2 Fun. It feels like a totally different language. Yeah it takes a little more time at first but then type inference and autocomplete starts to kick in and then you're screaming fast. And you "compile" it and everything just works. No hunting down edge cases or tracebacks cause you forgot to catch a None. I find it super satisfying. Much easier to stay in flow state when you aren't having to stop every few minutes to test stuff and dig through tracebacks.
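A tiny sketch of what that combination looks like in practice; the model and field names are made up, and pydantic is assumed to be installed:

```python
# Minimal sketch of pydantic + "mypy --strict"; the model and field names are
# hypothetical.
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str
    nickname: Optional[str] = None


def greet(user: User) -> str:
    # --strict forces the None case to be handled before use.
    display = user.nickname if user.nickname is not None else user.name
    return f"hello, {display}"


# Validation also happens at runtime: bad input raises a ValidationError
# instead of silently propagating a wrong type through the program.
print(greet(User(id=42, name="Ada")))
```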
The pain and suffering I see with different versions of Python installed with Pyenv in my team and surrounding ones is huge. I advise anyone to just stay away from it. On Macs, you can install MacPorts and it'll bring in every version of Python, from ancient to newest, in a sane BSD-like way (not in a user-writable folder, ffs). On Linux the default package managers offer pretty much every version of Python that's not outright dangerous to run. On Windows there are binary installers and every Python goes in a separate folder.
Not only is it not needed, it creates a layer of "magic" the user has to understand on top of the environment.
And, if you really want to use brew, you can still continue using it, knowing it only has python 3.9 at the moment. Plus, it can coexist mostly peacefully with Macports.
Interesting. Thank you for correcting me. Can you install multiple versions this way? My colleagues are suffering with that and I'd prefer not to move them from Homebrew if avoidable.
For sure, yes, they're distinctly named formulae and thus subject to their own install, remove, and version bump lifecycle management
IIRC the trick is that the "at" versions don't get "brew link"-ed by default, since they'd almost certainly smash on top of the non-at binaries or manpages or whatever. Using `PATH=$(brew --prefix "python@3.7")/bin:$PATH` or one's favorite context-switching gizmo will help, or (with the python ones specifically) using `$(brew --prefix "python@3.7")/bin/python -m venv ...` is a great way to avoid having a lot of special env-var silliness.
Great recommendations! Exactly my setup, except add pipx for global system tools, i.e. command line tools distributed through pypy (youtube-dl, black, etc.).
I also add Docker or Singularity containers as needed for deep learning deployment, but that is quite computing-platform and application specific.
Yes, indeed I mean the Python Package Index (PyPI). Mypy, pypy, pypi, pyenv, pipenv, pip, pipx. Easily confusable names are a core tenet of the Python ecosystem!
That is a great write-up! One extra bit I'd recommend to this list is using https://github.com/ikamensh/flynt to convert string format into f-strings. It requires Python 3.6.
venv + requirements.txt is simple and works. Why are people so obsessed with replacing this simple toolset/process with poorly maintained tools like Poetry/PyEnv?
Please help me understand why I would want to lock specific versions of libraries in my Python projects. I’ve worked extensively with Python for a number of years and have never needed a lock file for my dependencies.
There should be an article like this for every language, as the author mentions it's not just syntax to change a language. Much of it is experience based knowledge.
However, this is how you would set up a new project only, and in a piecemeal manner, if you didn't already have some existing template.
Compatibility issues with your main dev tool would rule out using Poetry, in the same way running the latest versions of any software without reason is a rookie mistake. Chasing a higher version number is a jr/intermediate folly.
If you're developing Python professionally, save your money and just pay for PyCharm. A morning of fluffing about with a dozen new tools which then have their own maintenance overheads is less cost effective than buying a product that gives you >70% of what the author has recommended, and does so in a generally consistent manner.
This blog post was a great reminder of how much PyCharm gives me on a day to day basis, how people get lost in their +1-more-tool mindset, and how switching languages is an initial cognitive overload.
Why? The year in particular seems very relevant because the "recommended Python project setup" (if there ever was such a thing) changes every other year or so.
The current work project[1] has all of these, with one substitution: Pyenv, Poetry, Pytest, pytest-cov with 100% branch coverage, pre-commit, Pylint rather than Flake8, Black, mypy (with a stricter configuration than recommended here), and finally isort. These are all super helpful.
There's also a simpler template repo[2] with almost all of these.
I replaced all the pyenv (and any version management in my projects) by the awesome VS Code remote development with containers. This is a game changer to me. 1 container = 1 clean installation, completely isolated, and you can share with your teammates.
Then I disable the use of virtual environments in Poetry, because it's a useless overhead.
Then for updates, I just change versions in my Dockerfile or .toml and rebuild the container from scratch in no time, which is cleaner than manual updates for everything IMO.
Why use the external pre-commit repos instead of pointing to the poetry-managed local installations for flake8, black, isort, etc? The repos approach results in a much simpler and cleaner pre-commit config, but since you end up with two pinned versions of each, managed by different updaters, it can invite drift which can be very hard to track down.
I'd prefer to only use the pre-commit versions of these libs, but then I'd sacrifice editor integration.
What is almost never mentioned is that the default virtual env will always set the python binary it is called from as the python interpreter for that environment.
So if I want to use Python3.9 i go python3.9 -m venv .venv, if I want to use python3.6 i go python3.6 -m venv .venv etc.
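A quick way to confirm which interpreter a given environment ended up with is to ask it directly, running the snippet with the venv's own binary (e.g. `.venv/bin/python check.py`; the filename is just an example):

```python
# Run this with the venv's interpreter, e.g.:  .venv/bin/python check.py
import sys

print(sys.executable)        # points inside .venv/bin/
print(sys.version_info[:3])  # matches the pythonX.Y that created the venv
print(sys.prefix)            # the venv directory itself
```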
Yes, lockfiles are great, but while everyone can execute the above commands, it is a lot harder to convince everyone in my organization to switch to poetry.
> Some people work with code from 13" laptops. And viewing diff of two files of 79 char per line max becomes very convenient.
I think it's not the only reason. Limiting columns forces developers to read more vertically, which is less tiring for the eyes. For example, try to read a book on a 27" monitor with no max width. Going to the next line quickly becomes painful.
Containers are great for web services or anything that primarily communicates through a port.
Less so for CLI style projects that read and write to the local file system. Yes I know you can bind mount directories but it’s clunky and you also usually have to fight with file permissions issues.
But literally everything else is optional or completely insane. This is not "best practice," its the entire kitchen sink. You might use a linter, or you might just try to write clean code. You might use Poetry, but frankly, the default pip installer is fine for most people most of the time. Installing and configuring something to sort imports for you is just silly, and pre-commit hooks are a terrible idea that just get in the way.
My recommendation would be to avoid adding tooling until you have a problem it solves, and focus on building something useful instead. I imagine that's true of most languages.
I think a big part of the problem is that bundler has been around for ages whereas half of the tools mentioned here didn't even exist just 5-6 years ago.
Which can be seen by this hilarious sentence:
> By default, Python packages are installed with pip install. In reality nobody uses it this way. It installs all your dependencies into one version of Python interpreter which messes up dependencies.
Why hilarious? Because everyone I know uses `pip install` if they're not using poetry. And because poetry is kinda new, that's a lot of people.
The quote isn't clear, but I think the author means people don't generally run pip with their system python. I.e. they don't run /usr/bin/pip or whatever, but use some virtual environment solution.
gem catches this by default and warns you. Pip doesn't, but it would probably be good if it did and the pip guys explicitly took the position that "installing stuff on your system python is not a normal use of it".
What poetry replaces is pip and requirements.txt. Our team had a lively discussion about this, but here are some good reasons:
* Keep test frameworks out of production (in case they've busted something; the two points below can happen to test libraries too.)
* pip recently changed the way its resolver works [1], breaking numerous projects. Yeah yeah, it's a major version bump, but lots of containers and environments installed the latest pip just sort of assuming that it would always handle requirements.txt the same way. With that contract broken, using pip directly isn't a non-decision anymore.
* Locking specific versions of dependencies lets you roll back Bad News in production. Let's say you have a project with library A which has dependency B. If A asks for the latest version of B, you can be in a situation where a new version of B breaks your project and you won't know about it until you do a release _and_ if you try to roll back you'll still be in trouble. We recently had to deal with a similar issue and had to fall back on an older container until we could figure it out.
* if your code is under src/ and your tests under test/, and you use a requirements-test file (or better, tox.ini), your test dependencies like pytest don't end up in a wheel
* if a container depends on 'latest' and not some semver major number, it's 100% the container's fault when it blindly updates to a new major version
Agreed, Pip isn't to blame, treating `pip install -r requirements.txt` as one's sole dependency management is.
You can choose to build your own dependency management practice around pip, or use one someone else has already created. I think that it is easier to get a team on the same page with something like Poetry, especially if they're used to bundler or npm.
You're right about the docker containers as well, but upstream does what it wants and downstream has a strong tendency to not mess with upstream's choices.
Indeed, why Poetry? Poetry at first glance looks and feels good (and I even used it for a while) until you hit a weird bug and find that it's one of the 989 open issues: https://github.com/python-poetry/poetry/issues
So it’s back to plain old venv + requirements.txt for me
Poetry has lock files, so installs are always the exact same version of everything.
I've found this is almost great, but when it breaks down, perhaps because two packages request a transitive dependency in different ways, or if the resolver really wants to install a new version of something that won't compile on your system, it's a giant pain and sometimes impossible to work around.
These days I just use pip with venv unless I have a good reason not to.
I wonder why VSCode even gets name-dropped in these Python setup tutorials lately. VSCode is spying on you and sending back "telemetry" data.
Given the current situation in Python, where there is little development, the old boys have totalitarian control, and new contributors are smart enough to avoid that mess:
Try out Rust, Go, Elixir or Lua instead. It might save you a lot of trouble. Heck, if you are willing to put in a lot of time to create carefully written objects, C++11 code can look a lot like Python (if you are into that).
They've invested a ton into Python support in VSCode in recent years. It's really good.
The language server Microsoft built for type checking and completion (https://github.com/microsoft/pyright) is excellent. It makes Python feel like a first-class statically typed language, which isn't something I was able to replicate in Pycharm (tried this very recently). Meanwhile, it's just baked into VSCode, no configuration needed.
I use Jetbrains tools for several languages, but for Python and frontend (Typescript with React/Vue/Angular) VSCode is hitting the perfect notes.
I haven't touched Python in years, and just recently came back to it. I was very pleasantly surprised to see just how much Microsoft has improved the tooling in VSCode with the recent update (https://visualstudiomagazine.com/articles/2021/05/11/vscode-...). I've only tinkered with it a bit, but it feels on par with the JS/TS experience in VSCode, which is surprising since those are native and have a more robust static types community.
My experience is also that PyCharm is much more pleasurable to work with, and on the safer side. I checked 1.5 years ago (so this could have changed) for remote development on a server, and had to download some 3rd-party rsync tools to be able to do so. With PyCharm it just works out of the box with a professional license.
The VS Code experience has definitely improved a LOT in the past 1.5 years with the new Python language server (Pylance) and remote development being improved quite a bit. We use it daily in production in my company with almost zero problems so far.