* #!/bin/bash instead of #!/usr/bin/env bash
* [ instead of [[
* -z instead of actually checking how many arguments you got passed, trusting the end user not to do something weird like pass an empty string to your program
* echo instead of printf
* `print_and_execute sdk install java $DEFAULT_JAVA_VERSION` who asked you to install things?
* `grep -h "^sdk use" "./prepare_$fork.sh" | cut -d' ' -f4 | while read -r version; do` You're seriously grepping shell scripts to determine what things you should install?
* Unquoted variables all over the place.
* Not using mktemp to hold all the temporary files and an exit trap to make sure they're cleaned up in most cases.
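For the last two points, the pattern I have in mind is roughly this (just a sketch; the variable names are made up rather than taken from the script):

    #!/usr/bin/env bash
    set -euo pipefail

    # one temp dir for everything, removed on any exit path
    tmpdir=$(mktemp -d)
    trap 'rm -rf "$tmpdir"' EXIT

    fork=$1
    out_file="$tmpdir/${fork}_timing.log"   # expansions quoted wherever they are used
    printf 'timing output goes to %s\n' "$out_file"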
I think Python is overused, but this is exactly what Python is great for. Python3 is already installed or trivial to install on almost everything, it has an enormous library of built-ins for nearly everything you'll need to do in a script like this, and for all of its faults it has a syntax that's usually pretty hard to subtly screw up in ways that will only bite you a month or two down the road.
My general rule of thumb is that bash is fine when the equivalent Python would mostly be a whole bunch of `subprocess.run` commands. But as soon as you're trying to do a bunch of logic and you're reaching for functions and conditionals and cases... just break out Python.
I've been pretty happy with the experience of using Python as a replacement for my previous solutions of .PHONY-heavy Makefiles and the occasional 1-line wrapper batch file or shell script. It's a bit more verbose, and I do roll my eyes a bit occasionally at stuff like this:
run([options.cmake_path, '-G', 'Visual Studio 16', '-A', 'x64', '-S', '.', '-B', build_folder], check=True)
But in exchange, I never have to think about the quoting! - and, just as you say, any logic is made much more straightforward. I've got better error-checking, and there are some creature comforts for interactive use such as a --help page (thanks, argparse!) and some extra checks for destructive actions.
Golang. You build one fat binary per platform and generally don't need to worry about things like dependency bundling or setting up unit tests (for the most part it's done for you).
I use different languages for different purposes. Although bash runs everywhere, it's a walking footgun, so I only use it for small, sub-100-line scripts with no more than one option.
The rest goes to Python, which nowadays runs almost everywhere, Julia, or a compiled language for the larger stuff.
If you just want to move some files around and do basic text substitution, turning to Python or any other "full-fledged programming language" is a mistake. There is so much boilerplate involved just to do something simple like rename a file.
I have a lot of scripts that started as me automating/documenting a manual process I would have executed interactively. The script format is more amenable to putting up guardrails. A few even did get complex enough that I either rewrote them from the ground up or translated them to a different language.
For me, the "line in the sand" is not so much whether something is "safer" in a different language. I often find this to be a bit of a straw man that stands in for skill issues - though I won't argue that shell does have a deceptively higher barrier to entry. For me, it is whether or not I find myself wanting to write a more robust test suite, since that might be easier to accomplish with Ginkgo or pytest or `#include <yourFavoriteTestLibrary.h>`.
Is it really so bad? A bit more verbose but also more readable, can be plenty short and sweet for me. I probably wouldn't even choose Python here myself and it's the kind of thing shell scripting is tailor-made for, but I'd at least be more comfortable maintaining or extending this version over that:
    from subprocess import Popen, PIPE

    CMD = ("printf", "x:hello:67:ugly!\nyy$:bye:5:ugly.\n")
    OUT = "something.report"
    ERR = "err.log"

    def beautify(str_bytes):
        return str_bytes.decode().replace("ugly", "beautiful")

    def filter(str, *index):
        parts = str.split(":")
        return " ".join([parts[i - 1] for i in index])

    with open(OUT, "w") as out, open(ERR, "w") as err:
        proc = Popen(CMD, stdout=PIPE, stderr=err)
        for line_bytes in proc.stdout:
            out.write(filter(beautify(line_bytes), 2, 4))
I would agree though if this is a one-off need where you have a specific dataset to chop up and aren't concerned with recreating or tweaking the process bash can likely get it done faster.
Edit: this is proving very difficult to format on mobile, sorry if it's not perfect.
That way, if something is easier in Ruby you do it in Ruby, and if something is easier in shell, you can just pull its output into a variable. I avoid 99% of shell scripting this way.
But if all I need to do is generate the report I proposed...why would I embed that in a Ruby script (or a Python script, or a Perl script, etc.) when I could just use a bash script?
Bash scripts tend to grow to check on file presence, conditionally run commands based on the results of other commands, or loop through arrays. When it is a nice pipelined command, yes, bash is simpler, but once the script grows to have conditions, loops, and non-string data types, bash drifts into unreadability.
I don’t think it’s fair to compare a workflow that is designed for sed/awk. It’s about 10 lines of Python to run my command and capture stdout/stderr - the benefit of which is that I can actually read it. What happens if you want to retry a line if it fails?
> I don’t think it’s fair to compare a workflow that is designed for sed/awk.
If your position is that we should not be writing bash but instead Python, then yes, it is absolutely fair.
> the benefit of which is that I can actually read it.
And you couldn't read the command pipeline I put together?
> What happens if you want to retry a line if it fails?
Put the thing you want to do in a function, execute it on a line, if the sub-shell returns a failure status, execute it again. It isn't like bash does not have if-statements or while-loops.
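Roughly like this, where `process_line` (and the `some_command` inside it) is just a placeholder for whatever the real work is:

    process_line() {
        # placeholder for the real work; a non-zero exit status means it failed
        some_command --input "$1"
    }

    while read -r line; do
        if ! process_line "$line"; then
            printf 'retrying: %s\n' "$line" >&2
            process_line "$line" || printf 'giving up on: %s\n' "$line" >&2
        fi
    done < input.txt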
My point is that if you take a snippet designed to be terse in bash, it’s an unfair advantage to bash. There are countless examples in Python which show the opposite.
> And you couldn't read the command pipeline I put together?
It took me multiple goes, but the equivalent in Python I can understand in one go.
> Put the thing you want to do in a function, execute it on a line, if the sub-shell returns a failure status, execute it again. It isn't like bash does not have if-statements or while-loops.
But when you do that, it all of a sudden looks a lot more like the Python code.
I have not really been a fan of ChatGPT quality. But even if that were not an issue, it is kinda hard to ask ChatGPT to write a script and a test suite for something that falls under export control and/or ITAR, or even just plain old commercial restrictions.
XONSH is a Python-powered shell
Xonsh is a modern, full-featured and cross-platform shell. The language is a
superset of Python 3.6+ with additional shell primitives that you are used to
from Bash and IPython. It works on all major systems including Linux, OSX, and
Windows. Xonsh is meant for the daily use of experts and novices.
Haven't heard of it before personally, and it looks like it might be interesting to try out.
I stopped caring about POSIX shell when I ported the last bit of software off HP-UX, Sun OS, and AIX at work. All compute nodes have been running Linux for a good long while now.
What good is trading away the benefits of bash extensions just to run the script on a homogeneous cluster anyways?
The only remotely relevant alternative operating systems all have the ability to install a modern distribution of bash. Leave POSIX shell in the 1980s where it belongs.
Except that'll pick up an old (2006!) (unsupported, I'm guessing) version of bash (3.2.57) on my macbook rather than the useful version (5.2.26) installed by homebrew.
> -z instead of actually checking how many arguments you got
I think that's fine here, though? It's specifically wanting the first argument to be a non-empty string to be interpolated into a filename later. Allowing the user to pass an empty string for a name that has to be non-empty is nonsense in this situation.
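For comparison, the two styles of check being discussed look something like this:

    # rejects a missing argument and an explicit empty string alike
    if [[ -z "${1:-}" ]]; then
        printf 'usage: %s <fork>\n' "$0" >&2
        exit 1
    fi

    # only counts arguments; an explicit "" would slip through
    if (( $# != 1 )); then
        printf 'usage: %s <fork>\n' "$0" >&2
        exit 1
    fi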
> You're seriously grepping shell scripts to determine what things you should install?
How would you arrange it? You have a `prepare_X.sh` script which may need to activate a specific Java SDK (some of them don't) for the test in question and obviously that needs to be installed before the prepare script can be run. I suppose you could centralise it into a JSON file and extract it using something like `jq` but then you lose the "drop the files into the directory to be picked up" convenience (and probably get merge conflicts when two people add their own information to the same file...)
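The jq version would look something like this (the versions.json layout here is entirely made up for illustration):

    # hypothetical central file mapping fork name -> Java version,
    # e.g. { "forkA": "21.0.2-graal", "forkB": "21.0.1-open" }
    version=$(jq -r --arg fork "$fork" '.[$fork] // empty' versions.json)
    if [[ -n "$version" ]]; then
        sdk install java "$version"
    fi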
> Except that'll pick up an old (2006!) (unsupported, I'm guessing) version of bash (3.2.57) on my macbook rather than the useful version (5.2.26) installed by homebrew.
Could you change that by amending your $PATH so that your preferred version is chosen ahead of the default?
I think `#!/bin/bash` will always invoke that file directly without searching your $PATH. People say you can do `#!bash` to do a $PATH search but I've just tried that on macOS 15 and an Arch box running a 6.10.3 kernel and neither worked.
They're definitely both critiquing the script in the OP for the same thing in the same way. They're in agreement with each other, not with the script in TFA
The 1brc shell script uses `#!/bin/bash` instead of `#!/usr/bin/env bash`. Using `#!/usr/bin/env bash` is the only safe way to pick up a `bash` that’s in your $PATH before `/usr/bin`. (You could do `#! bash`, but that way lies madness.)
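A quick way to see which interpreter a script actually got (assuming a newer bash sits earlier in $PATH than /bin):

    #!/usr/bin/env bash
    # with Homebrew's bash first in $PATH this reports 5.x;
    # a hard-coded #!/bin/bash on macOS would report 3.2.57
    printf 'running under bash %s\n' "$BASH_VERSION"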
As far as quick and dirty scripts go, I wouldn’t care about most of the minor detail. It’s no different to something you’d slap together in Ruby, Python, or JS for a bit of automation.
It’s only when things are intended to be reused or have a more generic purpose as a tool that you need them to behave better and in a more standard way.
I had some similar thoughts when seeing the script.
For better user friendliness, I prefer to have the logging level determined by the value of a variable (e.g. LOG_LEVEL) and then the user can decide whether they want to see every single variable assignment or just a broad outline of what the script is doing.
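Something along these lines (names made up):

    # LOG_LEVEL=debug ./script.sh ... prints everything; the default stays quiet
    log_debug() {
        if [[ "${LOG_LEVEL:-info}" == debug ]]; then
            printf 'DEBUG: %s\n' "$*" >&2
        fi
    }

    log_debug "resolved java version: $version"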
I was taken aback by the "print_and_execute" function - if you want to make a wrapper like that, then maybe a shorter name would be better? (Also, the use of "echo" sets off alarm bells.)
Most of the time, "echo" works as you'd expect, but as it doesn't accept "--" to signify the end of options (which is worth using wherever you can in scripts), it'll have problems with variables that start with a dash as it'll interpret it as an option to "echo" instead.
It's a niche problem, but replacing it with "printf" is so much more flexible, useful and robust. (My favourite trick is using "printf" to also replace the "date" command).
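For anyone curious, the two tricks look like this (the %(...)T format needs bash 4.2+):

    var='-n'
    echo "$var"              # bash's echo parses this as an option and prints nothing
    printf '%s\n' "$var"     # prints -n literally

    # printf standing in for date, with no external process
    printf '%(%Y-%m-%d %H:%M:%S)T\n' -1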
This one becomes very apparent when using NixOS where /bin/bash doesn’t exist. The vast majority of bash scripts in the wild won’t run on NixOS out of the box.