Standard error stream is such an unfortunate name. It's not just for errors and diagnostics. Any non-output data should be sent there, especially messages the user is supposed to see. It should have been called standard user stream.
Both outputs are for the user, even if the user redirects them. Standard user stream doesn't seem like a differentiating name.
Standard error stream makes more sense when you consider the Unix philosophy[1]: "Don't clutter output with extraneous information." It's not just about stdout. The more you output, the more you obscure the important data, and the more likely the user is just going to ignore it. Ideally, there should only be regular output and errors. The names are a good guideline for that.
When you're talking about stuff like progress indicators, you're already breaking from the mold, just like TUIs also do. You have little choice but to break Unix conventions and guidelines then. TUIs like vim don't output errors to stderr, you know. They include them in stdout.
This doesn't mean that stderr isn't a good name. It fits very well for regular Unix utilities that stick to the guidelines of the Unix philosophy.
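For what it's worth, here is a minimal Python sketch of the split being argued about. The function name and the messages are made up for illustration. Results go to stdout, while progress and warnings go to stderr, so a pipe or redirect only captures the data.

```python
import sys


def process(items, out=sys.stdout, err=sys.stderr):
    """Write one result per line to `out`; progress and warnings go to `err`."""
    for i, item in enumerate(items, start=1):
        # Progress is meant for the person watching, not for the next program.
        print(f"[{i}/{len(items)}] processing {item!r}", file=err)
        if not item:
            print("warning: skipping empty item", file=err)
            continue
        print(item.upper(), file=out)
```

Called as `process(["cat", "", "dog"])` from a script run with `> results.txt`, the progress lines should still show up on the terminal while the file only contains the two results.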
with another. That's fair enough, but what he doesn't mention is that running the tool and interpreting its output is
./runbenchmarks
with the second style of output and
./runbenchmarks
benchmarks --help
# damn, it just lists the command-line options
vim benchmarks.c
# look through source code to find what the random numbers after the program name mean
with the first. The culprit here is the fact that his preferred style not only puts each benchmark's output on one line but also omits the "Time:" and "Alloc:" and "ns/op" and "bytes/op" which make the numbers generated actually mean something to a human being.
I think the correct answer here may be to have a command-line flag selecting between two kinds of output, one intended for humans to read and one intended for programs to parse. Or maybe for the output to look like
fizzbuzz: 10 ns/op, 40 bytes/op
or
fizzbuzz 10 ns/op 40 bytes/op
either of which is pretty easy to parse for both humans and computers. Or even
fizzbuzz time 10 ns/op
fizzbuzz alloc 40 bytes/op
which lets you see all the results for the fizzbuzz benchmark with the same grep as above, and all benchmarks' time results with another almost-as-simple grep, at the cost of a little redundancy in the output.
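A rough Python sketch of that last one-measurement-per-line format, assuming benchmark names never contain whitespace. The `fib` benchmark and its numbers are made up so there is something to filter against.

```python
def format_results(results):
    """Yield one 'name metric value unit' line per measurement."""
    for name, metrics in results.items():
        for metric, (value, unit) in metrics.items():
            yield f"{name} {metric} {value} {unit}"


def parse_line(line):
    """Split a line back into (name, metric, value, unit)."""
    name, metric, value, unit = line.split()
    return name, metric, int(value), unit


results = {
    "fizzbuzz": {"time": (10, "ns/op"), "alloc": (40, "bytes/op")},
    "fib": {"time": (250, "ns/op"), "alloc": (0, "bytes/op")},  # made-up benchmark
}
lines = list(format_results(results))
```

The labels cost a few bytes per line, but a human can read each line without the source code, and a program can split on whitespace. Something like `grep '^fizzbuzz '` or `grep ' time '` is enough to slice it either way.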
Higher-level message: when you have two competing requirements (make things readable for humans, and make things parseable for programs), before just picking one as The One That Matters consider whether maybe there's a way to get both.
Yeah this was also my complaint. Personally I’d say just output the header. That’s what `tail -n +2` or whatever is for.
Another option is to detect when the output is being piped to another program and not print the header in those cases, similar to how many programs handle color. Or print the headers to stderr.
These are still good guidelines. As a tool developer myself, I would propose some additional guidelines that make a tool more accessible and useful.
Provide custom format options, such as --format-json or --format-<x>, to produce output in JSON or other popular formats.
Implement both short and long options (e.g., -f/--filename) consistent with other command line tools.
Implement a --verbose and/or --debug option to enable more detailed output when needed for troubleshooting.
Provide a --version option to display the tool's version and then exit.
Provide a --help option to display program usage and options and then exit.
Provide useful error messages that at minimum inform the user what went wrong when the program aborted.
As a corollary to the previous guideline, emit noisy error details such as stack traces only when --verbose is used and an error is encountered.
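Most of the list above comes nearly for free with Python's argparse. Below is a minimal sketch, not a definitive layout. The tool name `mytool`, the version string, and the line-counting task are placeholders, and it uses a single `--format` option with choices instead of one flag per format, which is my own simplification. argparse adds `--help` by itself.

```python
import argparse
import json
import sys
import traceback

VERSION = "0.1.0"  # placeholder version string


def build_parser():
    parser = argparse.ArgumentParser(prog="mytool", description="Count lines in a file.")
    parser.add_argument("-f", "--filename", required=True, help="file to read")
    parser.add_argument("--format", choices=["text", "json"], default="text",
                        help="output format (default: text)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show full tracebacks on error")
    parser.add_argument("--version", action="version", version=f"%(prog)s {VERSION}")
    return parser


def main(argv=None, out=sys.stdout, err=sys.stderr):
    args = build_parser().parse_args(argv)
    try:
        with open(args.filename, encoding="utf-8") as fh:
            count = sum(1 for _ in fh)
    except OSError as exc:
        # Short, actionable message by default; the noisy details only on request.
        print(f"mytool: cannot read {args.filename}: {exc.strerror}", file=err)
        if args.verbose:
            traceback.print_exc(file=err)
        return 1
    if args.format == "json":
        json.dump({"filename": args.filename, "lines": count}, out)
        out.write("\n")
    else:
        print(f"{args.filename}: {count} lines", file=out)
    return 0
```

A script would typically end with `sys.exit(main())` so the return value becomes the exit status.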
> Output should be free from headers or other decoration.
One way to solve this nicely, that seems to be getting more common, is to use `isatty()` to check if the output is a terminal and if so print with decorations, otherwise leave them out.
`ls` for example will output unprintable characters, even just space, in quoted form on a terminal:
$ touch 'foo bar'
$ ls
'foo bar'
But when redirected, it will output the raw value:
$ ls | cat
foo bar
> One way to solve this nicely, that seems to be getting more common, is to use `isatty()` to check if the output is a terminal and if so print with decorations, otherwise leave them out.
This is not a good idea because it can surprise the user (ls's behavior is actually bad from that perspective). For example, you run a program...
$ foo
ID Thing What
4 Cat Mews
2 Dog Woofs
5 Canary Tweets
...then you run the result through sort, trying to avoid the header...
$ foo | tail -n +2 | sort
...except instead of the expected result you get...
2 Dog Woofs
5 Canary Tweets
...because the program tried to be smart instead of consistent. This is also against the GNU guidelines as mentioned elsewhere.
The little surprise is worth the general improvement in usability (e.g. colors, progress bars, filenames you can copy&paste, terminal not getting corrupted by escape sequences, etc). It also makes it clear that the terminal output is for user interaction, so programs no longer have to be both UI and API at the same time. They can focus on one or the other, making both much better and cleaner as a result.
> ...then you run the result through sort, trying to avoid the header...
The much more common scenario would be doing `foo | sort` and then ending up with random header text in the sorted data. Few people will add a `tail` the first time they type that command or remember to do it every time they use it interactively. With `isatty()` it behaves as the user expects right from the start.
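Here is a small Python sketch of the `isatty()` approach, with an explicit override so that people who dislike the guessing can switch it off. The `header` parameter stands in for hypothetical `--header` / `--no-header` flags. The table is the one from the example further up.

```python
import sys

ROWS = [(4, "Cat", "Mews"), (2, "Dog", "Woofs"), (5, "Canary", "Tweets")]


def want_header(stream, override=None):
    """Decide whether to print the header.

    override: True/False forces the choice (e.g. from hypothetical
    --header / --no-header flags); None falls back to the isatty() check.
    """
    if override is not None:
        return override
    return stream.isatty()


def print_table(rows, stream=sys.stdout, header=None):
    if want_header(stream, header):
        print("ID Thing What", file=stream)
    for row in rows:
        print(*row, file=stream)
```

With this, a plain `foo | sort` would get only data rows, while someone who really wants the header in a pipe can ask for it explicitly. Whether that trade-off is acceptable is exactly what is being debated here.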
> `ls` for example will output unprintable characters, even just space, in quoted form on a terminal:
Pedantic quibble: GNU does this for special characters; in your example, the space is very much printable. Specifically, this became the default with coreutils 8.25: https://www.gnu.org/software/coreutils/quotes.html
I noticed that last week and was quite surprised by it.
I was writing a script to get the permissions of all files. I wrote something like `ls -l | grep -oE '^[^ ]+'` and ran it in my home directory for testing. And then I was surprised that the output was wrong. Turned out I had files with \n in their name there and ls was printing them on two lines, which confused the grep. (I still used that script, since I did not have any \n files on the real system.)
I was actually building an exam for a course involving shell scripting. A common question was, do something with all the files in the current directory, like grep them or delete them. The lecture notes said to use * for all those files, but then I realized rm * would not work in all possible cases. I spent like an hour trying to find a hopefully correct solution. However, the professor said the students would never figure it out in an exam, and I should just put * as the model solution. The shell is extremely brittle.
When dealing with filenames one has to get used to always using '\0' separated output instead of newline separated output, as filenames in Linux can contain everything except '\0' and '/'. Luckily most tools are prepared for this and have options to either output '\0' separators or accept them as input.
Dealing with all the files in the current directory would look something like this (for demonstration, can be made shorter by using '-name' or '-regex' option from 'find'):
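One way to sketch the idea, assuming the names arrive on stdin from `find . -maxdepth 1 -type f -print0` and are consumed in Python instead of a shell loop. The helper names and the script name `each_file.py` are mine, and printing the file size is just a stand-in for whatever you want to do per file.

```python
import os
import sys


def split_nul(data: bytes):
    """Split NUL-separated bytes into a list of filenames (as bytes)."""
    # A trailing NUL, as produced by `find -print0`, leaves an empty last chunk.
    return [name for name in data.split(b"\0") if name]


def main():
    # Intended use: find . -maxdepth 1 -type f -print0 | python3 each_file.py
    for name in split_nul(sys.stdin.buffer.read()):
        # repr() makes embedded newlines visible instead of breaking the output.
        print(repr(os.fsdecode(name)), os.path.getsize(name))
```

Call `main()` at the bottom when saving this as a script. The important part is that nothing ever splits on spaces or newlines, so names like `foo bar` or ones containing `\n` stay in one piece.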
On a related note, there was an article/website that talked about how to design the UX of a CLI tool properly, e.g. how to design the arguments, among other things.
I've been struggling to find it again. Does anyone remember what the article/site was called?
Between every program being modified to work around your old shell or you switching to a modern shell like everyone else, which one do you think sounds more reasonable?
You can do this straight from the shell without needing every program to be modified.
I understand there are less powerful shells, as you list, but if for some reason you can’t use a more modern shell, aren’t you even less likely to be able to install updated apps?
The GNU Coding Standards recommends not doing that:
“Likewise, please don’t make the behavior of a command-line program depend on the type of output device it gets as standard output or standard input. Device independence is an important principle of the system’s design; do not compromise it merely to save someone from typing an option now and then. (Variation in error message syntax when using a terminal is ok, because that is a side issue that people do not depend on.)
If you think one behavior is most useful when the output is to a terminal, and another is most useful when the output is a file or a pipe, then it is usually best to make the default behavior the one that is useful with output to a terminal, and have an option for the other behavior. You can also build two different versions of the program with different names.
There is an exception for programs whose output in certain cases is binary data. Sending such output to a terminal is useless and can cause trouble. If such a program normally sends its output to stdout, it should detect, in these cases, when the output is a terminal and give an error message instead. The -f option should override this exception, thus permitting the output to go to the terminal.
Compatibility requires certain programs to depend on the type of output device. It would be disastrous if ls or sh did not do so in the way all users expect. In some of these cases, we supplement the program with a preferred alternate version that does not depend on the output device type. For example, we provide a dir program much like ls except that its default output format is always multi-column format.”
Interesting that a major GNU util (ls) does exactly the opposite and prints differently (multiple entries on a line vs one line per entry) in terminal vs a pipe.
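The binary-data exception quoted above is easy to sketch in Python. This is only an illustration under my own assumptions: the function name is hypothetical, and `force` stands in for the -f option the standard mentions.

```python
import sys


def write_binary(payload: bytes, stream=None, force=False):
    """Write raw bytes to stdout unless it is a terminal and force is False."""
    stream = sys.stdout if stream is None else stream
    if stream.isatty() and not force:
        print("refusing to write binary data to a terminal; use -f to override",
              file=sys.stderr)
        return 1
    # Text streams expose the underlying byte stream as .buffer.
    getattr(stream, "buffer", stream).write(payload)
    return 0
```

Note that the check changes only whether the program runs at all, not what the output looks like, which is probably why the standard treats it as an acceptable exception.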
I wouldn’t say that this is terrible advice, just naive and limited. The only thing I almost completely agree with is to allow your program to be a filter; my disagreement comes from the fact that all pipelines need a starting point. ls is a good example. The one thing I agree with completely is the return code, which is especially useful when combined with a -q option (cf. grep, below).
Headers are useful for humans. Don’t want them? Have -H/+H options, with the default based on whether you will be outputting most often to a human or a filter.
Space-separated output makes sense IFF fields will NOT contain spaces. Not sure? Have a -d option, like cut does, to allow the user to specify the separator.
Verbosity can be wonderful and wonderfully bad. Consider having -v, possibly multiple -v’s, like ssh, and -q, like grep, to control the exact level.
In other words, don’t take simplistic advice, certainly not this advice. Examine the behaviour of flexible commands like grep and cut and tr and determine for yourself which options are best suited to your program.
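To make the option ideas above concrete, here is a hedged argparse sketch. The program name `report` and the column names are placeholders. It only implements -H to suppress the header, since a `+H` style switch would need extra argparse configuration that I have left out.

```python
import argparse
import logging


def build_parser():
    parser = argparse.ArgumentParser(prog="report")
    parser.add_argument("-d", "--delimiter", default=" ",
                        help="output field separator (default: space)")
    parser.add_argument("-H", "--no-header", dest="header", action="store_false",
                        help="suppress the header line")
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="more detail; repeat for even more (-vv)")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="only report errors")
    return parser


def log_level(args):
    """Map -q / -v / -vv onto logging levels."""
    if args.quiet:
        return logging.ERROR
    return {0: logging.WARNING, 1: logging.INFO}.get(args.verbose, logging.DEBUG)


def render(rows, args, columns=("ID", "Thing", "What")):
    """Return the output lines, with or without the header, joined by -d."""
    lines = [args.delimiter.join(columns)] if args.header else []
    lines += [args.delimiter.join(str(field) for field in row) for row in rows]
    return lines
```

The header default here is "on", i.e. tuned for a human reader. A tool that is mostly used inside pipelines would probably flip that default.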
Re interactivity: if a program is used infrequently, interactivity can be good. No argv[1]? Prompt the user.
My build scripts are completely automatic, but they are run frequently (sometimes multiple times a day) by many people. Over time, we’ve gotten a pretty good handle on what we need them to do.
My addlabel and makeiso scripts, OTOH, prompt with reasonable defaults because they are run far less often and use commands that are less familiar.
Consider first the needs of the users, and do not assume they know as much as you. Or as little.
This is really spot on! Thanks for summarising it all. Especially, I know that jq is almost ubiquitous, until it's not (for example: it is in neither BusyBox nor the default Alpine Docker images). So please: avoid JSON, or at least provide an option to choose the output format and support an alternative to JSON, YAML, whatever.
I like JSON in that it's structurally reasonably stable; even under hostile content, it will stay the same. That can't be said of other formats, which can get confused when newlines, spaces, nulls and other "hostile" separators are used.
That's specifically one of the reasons why you can never reliably parse ls output in an environment that is even potentially hostile or unclean.
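A quick Python illustration of that stability claim, using made-up filenames. It assumes the names are valid Unicode; arbitrary byte sequences would need extra handling that is not shown here.

```python
import json

hostile = ["plain.txt", "foo bar", "two\nlines", "tab\there", "nul\x00byte", 'quote".txt']

# One JSON document per line: separators inside the names are escaped,
# so each record stays on exactly one physical line.
encoded = [json.dumps({"name": name}) for name in hostile]
decoded = [json.loads(line)["name"] for line in encoded]

# Naive newline-separated output for comparison: the embedded newline
# makes one name look like two records.
naive = "\n".join(hostile).split("\n")
```

Here `decoded` round-trips to the original list, while `naive` ends up with one record too many.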