It is sad that many new command-line parsing libraries don't follow the GNU rules anymore. They more often use "-long". Then users have to figure out whether this means "--long" or "-l -o -n -g". To make the command line even more confusing, multiple tools I have used allow spaces in optional arguments (e.g. "-opt1 arg1 arg2 -opt2", where arg1 and arg2 set two values for -opt1). Every time I see this, I worry that I might be misusing these tools. I wish everyone would just follow getopt_long() and stop inventing their own weird syntax.
Edit: To be clear, I'm mostly "blaming" Go for re-popularizing this style by a) putting it in the standard library and b) being a widely used programming language; I'm not saying Go came up with this or anything.
(idk about the space separated args tho that's even worse)
Cobra (https://github.com/spf13/cobra), which is a pretty popular library for Go CLI applications, behaves more like classical GNU tools. It also offers usage/help autogeneration and autocompletion for popular shells.
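For anyone who hasn't used it, a minimal sketch of what that looks like (the `greet` command and its `--verbose` flag are invented for illustration):

    package main

    import (
        "fmt"

        "github.com/spf13/cobra"
    )

    func main() {
        var verbose bool

        root := &cobra.Command{
            Use:   "greet [name]",
            Short: "Say hello",
            Args:  cobra.MaximumNArgs(1),
            Run: func(cmd *cobra.Command, args []string) {
                name := "world"
                if len(args) > 0 {
                    name = args[0]
                }
                if verbose {
                    fmt.Println("about to greet...")
                }
                fmt.Println("hello,", name)
            },
        }

        // BoolVarP registers a GNU-style pair: long --verbose plus short -v.
        root.Flags().BoolVarP(&verbose, "verbose", "v", false, "enable verbose output")

        root.Execute()
    }

Cobra delegates parsing to spf13/pflag, which follows POSIX/GNU conventions, so shorthand flags can also be combined getopt-style.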
Not sure about the related point on compact short-option syntax, as in `tar -xvzf archive.tgz`, though...
(edit) after a quick & sloppy test it seems to work as expected
Yeah but `tar xvzf archive.tgz` also works so I remain wary of tar. Basically every time I have tried to do something that's not tfz or cfz or xfz, it went wrong until I checked the manpage.
Right. I probably picked one of the most flaky examples, sorry for that. Let's say `ls -lah`, which (I hope...) is less ambiguous.
In my defense, the specific example I gave is valid for both GNU and BSD versions of tar. If I understood correctly, the issue you point to (order among short form flags) is related to the fact that `f` expects an argument and consequently has to appear in the last position.
Ah, it's not a direct argument when you omit the hyphen and fall into "traditional" mode. I think after years and years I can finally wrap my head around how that works. :D
You don't "have to" use the examples, you can read them as get a feel, and read the captions to find the one that does what you want...
Which is faster and probably safer than scanning the documentation for individual flags and hopping you got the nuances right...
See, the two cases aren't:
(1) Thoroughly study the man page -> (2) become an expert on the command's options -> (3) try the command, secure in your mastery of it
vs
(1) Check tldr examples -> (2) try the command
They're rather:
(1) Open the man page, (2) scan and skim it and the dozens of irrelevant flags, caveats, and obscure options until you find some flags that look like they do what you want, (3) half-read them, (4) try the command
vs
(1) Check tldr examples, (2) find an example that does what you want (which is usually one of the covered use cases), (3) try the command using the example syntax
I'm generally satisfied by ZSH's inline options summary, but I'm happy to see a sane instantiation of this; it clearly fits a need. Thanks for the pointer (and sorry for the troll :/).
Google's command line flags library, known to the public as absl::Flags and formerly gflags, does not distinguish between --foo and -foo, these are both the flag "foo". Each flag has a unique name so there is never a short -f equivalent to --foo, and -foo can never mean -f -o -o.
The main design motivation of absl::Flags is that the flag definitions can appear in any module, not just main. Go inherits this. A quirk that Go did not inherit is gflags' --nofoo alternate form of --foo=false.
This is all documented at https://gflags.github.io/gflags/#commandline, which is pretty much a verbatim export of the flags package documentation that a Google engineer would see internally.
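To make the Go side of that concrete, a minimal sketch (the `name` flag is invented for illustration):

    package main

    import (
        "flag"
        "fmt"
    )

    // The flag's name is its only identity: -name and --name both mean this
    // flag, and there is no -n abbreviation or short-flag bundling. Because
    // the flag set is process-global, this declaration could just as well
    // live in any imported package, not only in main.
    var name = flag.String("name", "world", "who to greet")

    func main() {
        flag.Parse()
        fmt.Println("hello,", *name)
    }

`prog -name=gopher`, `prog --name=gopher` and `prog -name gopher` all set the same flag, while `prog -n gopher` is an unknown-flag error rather than an abbreviation.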
> The main design motivation of absl::Flags is that the flag definitions can appear in any module, not just main.
Well that's kind of horrifying. That means that command-line arguments are a form of global state, and can silently alter the behavior of the program without the calling scope noticing.
I'm kind of wary of these mechanisms, because I've been bitten by them before. There was a Python library I used that read its configuration from sys.argv the first time an object from the library was constructed. I had a rather painful time debugging to find that my script accepting a -b argument resulted in the library switching to batch mode and suppressing all graphics. Dang it, those were my arguments, and the library had no right to go behind my back and look at arguments that hadn't been directly provided to it!
If you think that's horrifying, what if I told you that a sufficiently-entitled operator of a given program can alter the flags at runtime ... using their web browser. https://twitter.com/jbeda/status/888635505201471490
Oh my. I have a gut feeling that I don't like it one bit, though I tend to be a bit more generous about logging. Logging is one of the only cases where its presence or absence doesn't change the inputs or outputs of any function, nor any other observable effect of the program. Having or removing logs doesn't impact the testability of a function, unlike any other use of global configuration.
You seem like a pretty reasonable person so prepare to be more shocked :-) In a glog stream like this, the things on the right side are not evaluated unless verbosity is on.
I have on occasion been called a reasonable person, and good heavens! I could understand that in a functional language with lazy evaluation, but that doesn't fit at all with my mental model of how C++ works. It can't be a macro, because the VLOG parentheses would need to enclose the entire expression. It can't just be the normal operator<<, because then the expression would always be evaluated. I suppose expression_with_side_effects() could return an object that is implicitly convertible to string, and the actual side effects happen in that optional conversion, but that would require lots of cooperation from the user.
I'm almost scared to ask. How is that even implemented?
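If I remember the C++ side correctly, there is no lazy-evaluation magic: VLOG(n) is a preprocessor macro that expands to a conditional, roughly `!(VLOG_IS_ON(n)) ? (void)0 : voidify() & LOG(INFO)`, so the stream expression to the right of << is simply never evaluated when verbosity is off. Go has no macros, which is why the Go port (github.com/golang/glog) asks you to write the guard yourself when the arguments are expensive. A sketch, with `expensive()` as a hypothetical stand-in for a costly call:

    package main

    import (
        "flag"

        "github.com/golang/glog"
    )

    // expensive stands in for a call with real cost or side effects.
    func expensive() string { return "costly debug dump" }

    func main() {
        flag.Parse() // glog registers its flags (-v, -logtostderr, ...) on the standard flag set

        // Arguments here are evaluated eagerly, even when -v is 0:
        glog.V(2).Info(expensive())

        // The guarded form is the closest Go gets to the lazy C++ macro;
        // expensive() only runs when verbosity is at least 2:
        if glog.V(2) {
            glog.Info(expensive())
        }
        glog.Flush()
    }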
Go was designed by former Bell Labs people who worked on Unix, Plan 9, or both. Many things about Go that people attribute to "googlism" are really attributable to work done at Bell Labs.
In my experience at Google, only Go does flags like this. Everything else (Python, Java, C++, blaze) uses the same flag syntax: long options with two dashes.
The Java ecosystem has historically used single-dash options, both in the SDK tooling (e.g. `java -jar`, `javac -classpath`) and in classic common libraries like Jakarta Commons CLI. It has moved away from this in recent years, so now you get a mishmash of single and double dashes depending on how old the option is. In some cases you end up with stuff like `java -showversion`, which prints the version to stderr, but `java --show-version`, which prints to stdout.
I have seen a mix. For example, many Android developer tools (not written in Go) use this single-dash style. I believe the standard libraries used for parsing in internal tools mostly support both syntaxes, although some docs do describe the old single-dash style by default.
TBH I have no idea; I've heard of Fuchsia, but know nothing about it. It seems pretty far removed from the majority of work I've done in Google3 (the monorepo).
>many things about Go that people attribute to "googlism" is really attributable to work done at Bell Labs.
We're 50 to 30+ years away from that Bell Labs work. They could have checked what happened in the meantime with the rest of the computing world, before re-imposing obsolete ways with the full power of Google behind them...
It predates Go significantly. C and C++ bioinformatics tools have used single-dash long opts since the 1990s, unfortunately. I expect the transgression didn't originate in the bioinformatics community.
Single-dash long options did not originate in bioinformatics, but they are more often used in this field than elsewhere. Perhaps that is partly because some of the most popular tools (e.g. blast, muscle, bedtools and gatk) followed this unfortunate convention.
One thing Go's flag package does that deserves a lot of blame is automatically sorting the flags alphabetically in the -help output. And the fact that you need to hack your way around it, instead of there simply being an option like nosort=true or whatever, is even worse. The whole idea is crazy, and basically equivalent to the statement that the order of parameters in -help serves no useful purpose.
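For reference, the workaround looks roughly like this: since PrintDefaults always visits flags in lexicographical order, you end up replacing flag.Usage wholesale and printing the flags yourself, in whatever order you meant them to appear (flag names here are invented):

    package main

    import (
        "flag"
        "fmt"
        "os"
    )

    var (
        input   = flag.String("input", "", "path to read (important, listed first)")
        output  = flag.String("output", "-", "path to write")
        verbose = flag.Bool("verbose", false, "obscure knob, listed last")
    )

    func main() {
        // flag.VisitAll (and thus PrintDefaults) walks flags in sorted order,
        // so controlling the -help order means writing Usage by hand.
        order := []string{"input", "output", "verbose"}
        flag.Usage = func() {
            fmt.Fprintf(os.Stderr, "usage: %s [flags]\n", os.Args[0])
            for _, name := range order {
                f := flag.Lookup(name)
                fmt.Fprintf(os.Stderr, "  -%s\n\t%s (default %q)\n", f.Name, f.Usage, f.DefValue)
            }
        }
        flag.Parse()
    }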
And yet, I expect flags to be sorted in a man page; I rarely read things in a logical order, I'm just looking up which flag does what.
It's a convention-over-configuration thing, I think. I mean, they set a standard, so you can move on. The alternative is to sit and think and discuss what order to put your documentation in.
You read text from top to bottom. Chances are that you're writing help text and describing the most commonly used flags at the top, and the more obscure ones lower down.
So you read the whole man page when you need a flag that does something specific? Or do you mean you never write new things, and just have to look up flags already in use by some script?
Because for everything else that seems like a fascinating waste of time.
Because you don't always know which words the man page uses to describe specific functionality. So many ways to express similar ideas, language is fun that way.
Thankfully we have git.sr.ht/~sircmpwn/getopt, github.com/pborman/getopt, github.com/mattn/go-getopt, rsc.io/getopt, and a hundred more, but I really wish getopt were part of the standard library.
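For example, with github.com/pborman/getopt/v2 (going from memory on the exact signatures, so treat this as an assumption rather than gospel):

    package main

    import (
        "fmt"

        "github.com/pborman/getopt/v2"
    )

    func main() {
        // Proper GNU-style pairs: --verbose has the distinct short form -v,
        // so -verbose is a parse error rather than a second spelling.
        verbose := getopt.BoolLong("verbose", 'v', "enable verbose output")
        name := getopt.StringLong("name", 'n', "world", "who to greet")

        getopt.Parse()

        if *verbose {
            fmt.Println("being verbose")
        }
        fmt.Println("hello,", *name)
    }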
I think Go's package "flag" was partially inspired by the one made by Apache for Java, but I can't find any sources confirming that now, so I might have seen that in a dream, heh.
However, even those seem to raise some questions. For example, this:
> To make command line even more confusing, multiple tools I have used allow spaces in optional arguments (e.g. "-opt1 arg1 arg2 -opt2", where arg1 and arg2 set two values for -opt1).
is described as something that's permissible:
> An option and its argument may or may not appear as separate tokens. (In other words, the whitespace separating them is optional.) Thus, ‘-o foo’ and ‘-ofoo’ are equivalent.
Therefore the following would be considered equivalent:
`-opt1 arg1 arg2 -opt2`
`-opt1arg1arg2 -opt2`
Were your expectations different?
Are there any good articles on the benefits of following such rules (any tangible improvements to legibility or usability, as opposed to just "consistency amongst different tools")?
Are there any tools which can validate whether a piece of software conforms to this standard (either by scanning the man pages, or the code, or a formalized description of the parameters the app supports)? Personally, the closest I've found is Typer (https://typer.tiangolo.com/), but without anything that can automatically reject non-conformant code as part of a CI process, I think enforcing such formats would be a non-starter for me.
The point isn’t to use getopt with all its complexity, the point is that two dashes for long options is already extremely well established and Go popularizing long options with a single dash is very much a regression. It creates a lot of unnecessary confusion when they really should have known better and just stuck with the conventions.
In your regex at least, removing the confusion is as simple as adding another '-', and now you're in compliance with the expectations of almost every IT person in the world who uses Unix command lines.
But that is a bad convention, it prioritizes typing speed over readability, and it only works in certain cases (for flag-only parameters). I'd say good riddance to it.
That doesn't justify long options starting with a single dash, as one could have made every option start with two dashes. Sure, `--` is longer than `-`, but typing speed shouldn't matter right?
Sure, but the only reason to add an extra dash is to differentiate --long from -l -o -n -g. No reason to add extra characters if you don't need this differentiation. Not to mention, Go's command-line parsing actually accepts both -long and --long, if you find the -- version more aesthetically pleasing.
If we only ever stuck to "the accepted convention" there would be no going forward. I see this change (of being explicit) as a win.
The situation was already confusing before, with different tools using different conventions. This way of being explicit also lets you be consistent across OSes. The world is not only GNU, fortunately.
GCC, or anyone else's C compiler, doesn't really count as far as this convention goes. GCC's flag parsing aims to be compatible with conventions that preceded GNU; other vendors' compilers break their own conventions to be compatible with GCC or whatever else cc(1) is, etc.
GNU's conventions are generally complementary to, but not incompatible with, POSIX. And POSIX specifies the behavior of the sort of flags cc(1) should understand[1].
There are many POSIX and other traditional *nix tools that are a convention unto themselves for historical reasons. E.g. notice how GNU "dd" doesn't follow normal GNU command-line conventions either.
-long = -l -o -n -g was probably a mistake. But then again, tar xvzf should have been called untar, so it’s not like there are a shortage of opinions and historical mistakes.
My question was about how "tar xvzf" should be called "untar".
Your reply might still make sense (i.e. untar could automagically figure it out), but I was highlighting how tar/untar today also means (de)compressing that tar archive using many different compression formats.
I'd like a word with the person who thought that regular ( ) parentheses for grouping were a good idea in the find syntax, requiring them to be backslash-escaped in shell scripts.
The obviously right choice would have been [ ]. You know, like in
The Amiga had a pretty cool feature where CLI argument parsing and help were provided via a library. This made things nicely consistent across almost all of the CLI tools.
Suddenly I'm reminded of how Windows represents the command line as a single string (PWSTR), and how entry points that expect argv-style are parsed by the CRT at startup.
vs. Unix where char *argv[] is what makes it to the syscall layer.
The result there is that command line processing is more consistent program-to-program on Unix. On Windows, every program could decide to tokenize the arguments differently.
I feel like there are a few interesting Microsoft phenomena that contrast with Unix thinking in both of these examples.
CommandLineToArgvW - You called that "WINAPI", but it's worth mentioning the more specific provenance of shell32.dll. This is not a core, foundational part of Windows that is used in core, foundational things. It's a helper function from the shell (Explorer, not shell in the Unix sense). So, while it has a look and function that seem pretty foundational, it really isn't. It's there because somebody working on Explorer long ago found it useful to have and decided to export their helper function from the DLL.
CRT - A CRT binary ships with Windows, but really, that code is maintained and distributed by the compiler guys and DevDiv. So theoretically, the argv parser could change at those people's whim alongside a new Visual Studio release. And it seems from squinting at that github issue like that might have happened here.
So really ... there are more artifacts here attesting to the fact that the command line arg parser is not part of the operating system. People find that functionality useful, so they look for things that "look like" the operating system official method, and maybe they find stuff that does "look like it" -- but such a thing isn't really there.
I was not arguing that it was or was not part of the OS, just showing that deferring the parsing to application code has produced two subtly incompatible implementations that differ for no reason other than that they do.
Yeah, I am not considering anything you say to be argumentative, I am just going off on tangents with this topic because I have some experience there and find it interesting.
That's a good thing. You have to be careful using a command-line SQL query when typing "SELECT *". If the processing is left to the program, an SQL app on Windows knows you didn't mean the "*" to mean all the files in the current folder.
There were also the Amiga style guides that were published with 2.0 that detailed how developers should build application user interfaces. The fragmentation in Linux/Unix distributions means that this kind of consistency is pretty much impossible, although FreeBSD does a much better job of being consistent than $majorlinuxdistros.
I love the "long options start with two dashes" convention. It means that you can chose short options that are easily combined (in cases where the command and its options are often used), or you can use long options that are much easier to understand (because they are full words). More command line tools should support them.
I typically use long options in shell scripts that will be checked in or shared with others. The self documenting nature of long opts is much nicer (imho) than the terseness of short ones.
I'm also glad short opts are available for my personal day-to-day work. I spend most of my time in a terminal and appreciate having short-hand available.
It always confused me when tools don't follow that rule. E.g. "find", where "find . --name '*.dat'" won't work but "find . -name '*.dat'" will, and it's not the only one.
`find` is weird anyway. The stuff after the paths isn't really flags; it's a tiny filter language, with significant ordering and operator precedence and all that. Using "normal" option syntax wouldn't make a lot of sense for it either.
find . -name '*.py' -a -executable -o -printf '%P\n'
can be read
(F.name matches '*.py') && (F is executable) || print('%P\n', F)
where F is the current node in the file system traversal.
They both respect -o as OR, ! as NOT, and ( ) for precedence, which you have to quote as \( and \).
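To make the "tiny language" point concrete, here is that earlier expression transcribed into Go's filepath.WalkDir, under the reading given above (a rough sketch, not a faithful find implementation):

    package main

    import (
        "fmt"
        "io/fs"
        "path/filepath"
    )

    // Rough Go rendering of: find . -name '*.py' -a -executable -o -printf '%P\n'
    // The -printf branch fires exactly when the left side is false, so this
    // prints every entry that is NOT an executable *.py file.
    func main() {
        filepath.WalkDir(".", func(path string, d fs.DirEntry, err error) error {
            if err != nil {
                return err
            }
            info, err := d.Info()
            if err != nil {
                return err
            }
            isPy, _ := filepath.Match("*.py", d.Name())
            isExec := info.Mode().Perm()&0o111 != 0
            if !(isPy && isExec) {
                rel, _ := filepath.Rel(".", path)
                fmt.Println(rel) // find's %P: the path relative to the start point
            }
            return nil
        })
    }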
A couple years ago, someone helped me implement a better "find" without this wonky syntax for https://www.oilshell.org/ . But it isn't done and needs some love. If anyone wants to help, feel free to join Zulip :)
I do think that "find" is more like a language than a command line tool. It's pretty powerful, e.g. I just used it to sort through 20 years of haphazard personal backups.
Thanks for the article! How did I not see this before?
Didn't know that POSIX has obsoleted -a and -o either.
I guess I have some shell scripts to rewrite, heh.
Well the thing find and test have in common is that they lack a lexer! They abuse the argv array for tokens instead. I might call it the "I'm too lazy to write a lexer" pattern :)
jq has a lexer and hence a "real" syntax, but so does awk, which is maybe 30 years older. But yes, jq is a surprisingly big and rich language, maybe bigger than awk.
Find is about as user-unfriendly as a shell command could be. I never get it to do what I want on the first try. And its error messages are always cryptic and unhelpful.
I don't think any shell commands are particularly "friendly." Most are intentionally terse (in fact I find verbose, "friendly" command options to be annoying), and you learn them by repeated use, or for those that you use only occasionally, by consulting the man pages.
What does "illegal option" mean exactly? Why is it "n" which is the first letter of "-name"? Yes, it wants a path. Yes, even if you want to search in the current directory. Yes, it IS unusual, because all other commands that operate on directories, like `ls`, assume current directory if you don't specify any.
Why could it not just say "a path is required" instead?
It's saying that because it's using getopt to parse any initial option arguments. That diagnostic message is the standard default message printed by the getopt function whenever encountering an invalid option flag. It means all utilities using getopt will, unless you disable the default behavior, display the same initial diagnostic. It's idiomatic for utilities to then print a short usage message of its own.
Judging by the usage message you printed, you were almost certainly using a BSD implementation, probably on macOS, which in turn is probably sync'd from FreeBSD. `find -name something` will fail early in main. See https://github.com/freebsd/freebsd-src/blob/b422540/usr.bin/... When processing the 'n' in '-name' getopt() will return '?', which will end up calling usage().
The GNU implementation of find is completely different, though I'm not sure it does what you expect:
$ find -name something
That prints nothing and returns a successful exit code. But if you remove the "something" operand you get what I presume you were originally expecting as an error message:
Not rocket science, but as a programmer and maintainer which approach do you think makes more sense? Is trying to do the supposedly intuitive thing worth it, especially considering find's already arcane and irregular syntax? As an experienced command-line user I'd just be thankful that the option flags (as opposed to the filter directives) are parsed regularly.
This is a good explanation of why it has the current behaviour, but it doesn't answer the question of why the behaviour isn't better (i.e. telling the user what's needed, the path, instead of telling the user that what was provided is not what's needed, which is vague and leaves it up to the user to figure it out).
It's not like the source code is now etched into stone and can't be changed. Or is it?
GNU find, or at least my version of GNU find (4.8.0), will just assume "." if the path is missing, and will work as expected. I think various forms of BSD find are a bit stricter, and based on that usage message it seems to be BSD find.
It gave you the list of options (I think that's at most one of -H and friends, as many as you like of -E and friends, and -f with an argument), and -n isn't one of them.
Several BSD commands are pickier than GNU commands about option order, sometimes for good reason, sometimes because it was easier to write that way.
This is why I've ultimately come to the conclusion that shells are for casual use only, not for any kind of serious work. There are too many implementation details, inconsistencies, and footguns to write anything that needs to be somewhat reliable.
To be fair, there is one shell that I think someday we could rely on. https://www.nushell.sh/ Besides that, my answer is "any programming language," since at the core, dealing properly with system calls and their outputs is the whole reason PL's exist. In practice, I've been using Rust lately which makes a nice systems language, but JS and Python are always options for shell-like scripts that don't suffer from quite the level of degeneracy when encountering weird filenames or unexpected input in general.
That would be a terrible shell. Changing directories, listing them, moving files, running programs are all simple no-brainer operations in any reasonable shell, but are non-trivial in any programming language that's not designed to be a shell.
So you use the shell for things that require no brain: browsing your directory tree, casual printing of files. Then, when you need to encode these operations in a script, you pull out a scripting language, because you need more than the shell can provide with its casual nature.
Legacy and backwards-compatibility. find(1) is a really funny example, too, because POSIX find doesn't have that many flags, so they could probably fit all of them into the short format.
I've seen this a few times, but the one that always gets me is things like aws: it does something in response to "aws --help", but it doesn't tell you that you really want to call "aws help" to get some useful help.
I've seen that pattern before, but it always drives me a little crazy.
I'd rather have the suggestions, I don't take hints from computer software personally :) Sometimes I just misremember a particular command (i.e. "submodule" vs "submodules").
It’s unfortunate that Go’s standard package `flag` doesn’t follow the standard either, given the language is otherwise a good fit for command-line tools.
I ran into a related issue a couple of years back where people were using single-dash flags for a C++ project that was using Abseil flags in conjunction with getopt parsing of short flags (for legacy reasons). Why were they using single-dash flags, despite that not showing up anywhere in our documentation? They copy-pasted from --help.
(I'm happy to say that --help in Abseil has since been fixed.)
But that doesn’t preclude mistakes by collision (N short flags match a long one) or unpredictable bugs in a long flag interpreter (a short flag being a substring of a long one)—both being trivially common bugs when this ambiguity is allowed, especially when an API is ported to another environment with less tooling standardization around interpreting the input.
Go doesn't allow for specifying multiple short flags all run together, or for flag args without spaces, so neither of those are directly relevant here.
Also, that first issue happens with POSIX flags (with the GNU long flag extension, anyhow): `grep -help` is different from `grep --help` (and if you type the former, it'll just wait patiently for you to close stdin).
Which is also why Windows uses backslash (\) as their path separator. Because forward slash would have collided with the slash option marker Windows inherited from VMS.
That is surprisingly false. Microsoft operating systems use both / and \ as a path separator, going all the way back to DOS.
Early versions of MS-DOS made it a user preference option in the command.com interpreter, whether the user wanted to use / for options and \ for path separation or vice versa.
In longer words: Windows was originally a GUI system on top of DOS which was influenced by CP/M. The NT kernel did away with DOS, but the influence still lives to this day. For a simple one: not being able to name a file "con" (or any capitalized variation) comes all the way from CP/M.
For the uninitiated: OSes from that era didn't have "directories"; everything lived in the root of the drive, including device files. So, to print a file, you could literally do something like:
A> type FILE.TXT > PRN
When DOS added directories, they retained this "feature" so programs unaware of what directories were could still print by writing to the `PRN` "file". Because of "backwards compatibility", NT still has this "feature" as well.
One thing VMS got right is that each binary declared its supported options and the shell could tell you what they were. And it would take any unique abbreviation.
PowerShell scripts and cmdlets work similarly. They probably won't have help text, but at least you can see what's available without having to look at the argument-parsing section of the script. And you can use the shortest unique prefix as the short form of an argument (though I don't love this, since adding an argument can break the shortened form of other arguments).
It'd make the typing simpler. PowerShell has POSIX-like aliases, like 'rm' and 'cd', but they don't accept POSIX parameters. So you end up with "rm -Recurse", since rm is an alias for Remove-Item.
I like PS in theory but the syntax and naming just absolutely kill me. What were they smoking when they named as simple an operation as delete "Remove-Item"? And what's with all of the capital letters?
That's what happens I guess when the people designing it haven't actually used a CLI day to day much, because, well, they're using Windows.
I can't agree. I have used Linux shells for some time (since 97), and while the olden days would be me laughing at vbs and all that awfulness, I'd take PowerShell any day.
The short terse commands and the really awkward, confusing, mistake-prone syntax of sh or bash really rears its ugly head in scripts.
Interactive shell? No problem. But that's the beauty of PowerShell: verbosity and correctness in scripts, where the IDE quickly expands those long commands, and short aliases for interactive use.
> The short terse commands and the really awkward, confusing, mistake prone syntax
When used in an interactive shell short commands save time and effort. And it is easy to learn and remember them because in everyday work you need only about 10 commands. For some some commands which I use a lot I have one-two letter aliases to type even less e. g. i=fgrep.
It makes shell scripts less readable for someone who come from windows and and don't know even common shell commands, but for someone who use shell at least from time to time it should be easy to read.
Yeah I agree with that. Bash (and friends) scripts are awful. PS scripts are nice and readable, and not subject to the insane quirks of bash ([ vs [[ vs test? come on)
Seems like the real solution is separating scripts from interactive use.
Ironically, it already happened: bash for the user interface, while /bin/sh is something else. But bash keeps being a REPL that was accidentally promoted to a user interface.
> What were they smoking when they named as simple an operation as delete "Remove-ChildItem"?
Simple. All these commands work with providers, of which the file system is just one. Other providers include the Windows Registry, environment variables, certificate stores, and functions and variables in the PowerShell runtime. More providers can also be created and plugged into the system. PowerShell providers are essentially Windows' FUSE. See [0] for details.
So, for instance, you can do `Get-ChildItem HKCU:` to list entries under HKEY_CURRENT_USER in the Registry, the same way `Get-ChildItem C:/` will list you top-level items on the C: drive. Worth observing: while the console output for these two commands is similar, the results are in fact different objects underneath (Microsoft.Win32.RegistryKey vs. System.IO.FileInfo).
In short, these commands are an abstraction over file-system-like things. Whether or not that was a good idea is a different question.
It makes a little more sense in context to me. The verbose Verb-Noun style works because the verbs are designed to be a limited set. E.g. there's Remove- but no Delete- in the standard (shown in `Get-Verb`). So you can press ctrl+space after typing Remove- and see all the different types of things you can remove. Too many, so you can filter to Remove-<prefix>* etc. The verbosity of cmdlet names when using it as a shell is mitigated by the aliases (e.g. rm), and for parameters by accepting any case and shortening to anything non-ambiguous (e.g. `rm -rec -fo`).
I guess the capitalisation comes from C#/.NET's casing? I like PascalCase for its great readability/conciseness tradeoff over others, and it's standard Windows case-insensitive, so I've never had a huge issue with it.
The tradeoff is that "all the things I can remove" is usually "the set of all things my shell knows about" and not "the set of things related to my task at the moment" -- ChildItem-* would be more helpful!
Neat thing you can do is type "*-Noun" and the tab completion will give you options that fill in the "*". Alternatively "Get-Command *-Noun" will also list out all of the matching commands. Get-Help also supports that kind of wildcard so you get the list of commands along with their help summary.
The "*" can even be in the middle. I open VS solution files all the time from Powershell. Since there are often many other files and folders with similar names alongside them I just type ".\*.sln" and hit tab.
I disagree and agree with the sentiment. As someone more familiar with Linux, I sure would prefer to be able to assume a similar style.
But the biggest thing I'm happy about WRT Powershell is that it's consistent (and pretty well documented). At least it makes sense. Batch scripting really didn't.
Except they did, and I for one wish traditional Unix shells would die. Composing software by having every single program and script include a half-assed parser and serializer is causing a lot of unnecessary waste and occasional security problems in computing. Moving structured data in pipes is just a better idea.
Wish I could (actually, I'd prefer JSONB or some other binary format). Unfortunately, every program in the UNIX ecosystem assumes unstructured text in pipes, and makes it my responsibility to glue them together by building ad-hoc parsers with grep, head, sort, sed and awk.
A lot of more recent programs (such as the AWS and K8s tools) can easily output JSON. You can make schemas match, but most of the time you'll need something like jq to transform what one program outputs into what makes sense for the other.
I always try to design my tools with a "terse" output that makes it easier to pipe it into other programs.
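A sketch of the JSON-lines version of that idea in Go, one object per line so downstream consumers can use a real parser instead of grep/sed (the record type and fields are invented for illustration):

    package main

    import (
        "encoding/json"
        "os"
    )

    // Result is a hypothetical record type; the fields are just for illustration.
    type Result struct {
        Name string `json:"name"`
        Size int64  `json:"size"`
    }

    func main() {
        enc := json.NewEncoder(os.Stdout) // Encode emits one JSON value per line
        for _, r := range []Result{{"foo.txt", 42}, {"bar.txt", 7}} {
            enc.Encode(&r)
        }
    }

Anything that reads line-delimited JSON, jq included, can consume that output directly.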
Fwiw I'm pretty sure that POSIX_ME_HARDER wasn't an RMS-ism. RMS invented the "-pedantic" gcc flag (to enable some warning messages that he felt weren't necessary) and that always got a laugh when he talked about it and that was more his style. POSIX_ME_HARDER was more of a signature style of one of the other devs at the time, rather than of RMS.
I remember seeing this buried in some ifdefs somewhere, or in compile output, back in the day.. to me, finding the origins of this is almost as interesting as the longopt story itself, or more so :)
Oh come on. I had a laugh reading it, but immediately understood why it was changed—voluntarily, not by anything resembling police. I wouldn’t wish real censorship on anyone, but I wish y’all were at least able to identify it with better accuracy than a poorly trained AI.
I think what's recent is the syntax for mixing short and long options together on the same command. Some commands had long options, some had short, but with "--" one command can have both.
Pretty sure some programs used them before 1990, just not with a convenient getopt_long(). I know it's not the best example, but 'dd' used things like "if=whatever skip=123" prior to 1990. The article also mentions find, but it used single dash long options.
Those aren't really options. The syntax of the find command is
find <options> <paths> <expression>
Those thing you list are part of the <expression> part of the command. The <options> part in BSD find, and I believe GNU find, only uses options of the form -X where X is a single character.
It's a little confusing because the man pages for both BSD and GNU find do call some of the things that appear in the <expression> part of the command "options".
> There were a few programs that ran on Unix systems and used long option names starting with either - or no prefix at all, such as find, but those syntaxes were not compatible with Unix getopt() and were parsed by ad-hoc code.
The Free Software Foundation held a public election on how to do long options three decades or so ago.
This was likely before the effects of Eternal September began destroying the public Usenet, so the vote may well have been held there, in one of the newsgroups relevant to the FSF, GCC or GNU.
The '--' alternative won overwhelmingly, as I remember it. A few hundred votes were cast by email.
Weirdly, this kind of syntactic idiosyncrasy is something that got me interested in Erlang. Finally a language that uses full stops when a routine full stops. I find most of the rest of its syntax uncomfortable (I didn’t spend much time with the language, I’m sure it’s fine when you’re used to it), but I always found it weird to end a completed statement with a statement-list-joining punctuation mark.
I thought of mentioning the Prolog heritage. Weirdly CSS (having the worst syntax consistency of any language I can think of) is hyphen-heavy and solves its negation infix operator ambiguity well: it needs to be surrounded by whitespace.
For Prolog/Erlang, I think the preceding syntax is disambiguating enough
I always thought this was a Wirthism, because Pascal ends unit and program with an "end." (with a dot), whereas function and procedure are terminated with "end;" (with a semicolon). I don't know about other Wirth languages though; maybe it is Pascal-specific and not really something typical for Wirth?
Erlang is great, and I got used to the punctuation, but it's kind of a pain when you're moving code around.
Oh, now this is the last thing, gotta take off the ; or replace a , with a .
At least when I was starting out, I'd have loved a more C-like syntax with {} and consistently semicolons. Of course, Elixir came and just got rid of most punctuation, which I like less.
Anyway, it's consistent and after a couple weeks of messing it up, I can consistently see where the mistake is from the compiler error; after several years, I still sometimes mess it up, but oh well. I can't recall having messed up the punctuation so much that it still compiled but wasn't what I meant, so it's almost always quick to recover.
I mean, if you want to undo subtracting something, you add it? The only reason I ever found it confusing is that it was "backwards", but that feels like a forced error due to + requiring a shift while - doesn't.
You could ask Richard Stallman himself, who is well known for responding to random inquiries from people.
Presumably it was rejected because the whole point of POSIX was to consolidate, regularize, and simplify pre-existing practice. Adding "+" as an additional standard option signifier would be a huge step in the complete opposite direction. The only precedent for "+" would have been the `set` shell builtin, and AFAIU the committee only begrudgingly grandfathered that syntax.
Someone elsethread mentioned the `date` utility, but if you look at the BSD implementation "+" isn't used as an option marker, per se, but rather to disambiguate operand strings. The 2001 standard only defined the following:
date [-u] [+format]
date [-u] mmddhhmm[[cc]yy]
It's splitting hairs, but POSIX was at least able to shoehorn the legacy syntax into a more regularized base interface.
In many astronomy programs, we have to put up with the PFILES convention, where arguments are given like so
mytool infile=foo.fits outfile=bar.fits
Some parameters need to be given (and will be prompted for if not given), and some get a default if they are omitted. The extra tricky part is that the parameters and their defaults are read from .par files in a path. There are the default ones for the tool, and a user-specific parameter file which can be modified using the "pset" program (or read with "pget"). A tool when run will also modify the par file to update various parameters with the ones given on the command line.
Unfortunately, there is no form of locking on these par files, so one has to mess around with the path settings (to make per-process paths), or use some form of locking, to ensure they don't get corrupted if multiple processes are run at once.
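For readers who haven't run into it, the key=value surface syntax at least is easy to sketch, here in Go, with invented defaults and none of the .par-file or prompting machinery:

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    // Minimal parser for PFILES-style "key=value" arguments, as in:
    //   mytool infile=foo.fits outfile=bar.fits
    func main() {
        params := map[string]string{"infile": "", "outfile": "out.fits"}
        for _, arg := range os.Args[1:] {
            k, v, ok := strings.Cut(arg, "=")
            if !ok {
                fmt.Fprintf(os.Stderr, "not a key=value argument: %q\n", arg)
                os.Exit(1)
            }
            params[k] = v
        }
        fmt.Println(params)
    }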
Kind of off-topic, but somehow I've found that blogs that use that particular template (with the blue top header and all) pretty much always have content that I find interesting or useful. Often they contain niche information that's hard to come by elsewhere. Has anybody felt similarly/know what might cause this?
It’s a rare moment on HN when a commonly maligned CMS written in a commonly maligned language gets such high praise for its default behavior. Just commenting here in hopes I can come back and reflect on the weird disconnect between UX and nerd preference.
I believe it is a WordPress default template. Perhaps it is because these authors are so focused on quality writing they have no time for frivolous nonsense like 'templates'.
It was the default Wordpress theme from ca. 2005-2010 which at least in my mind was basically the peak era of "the blogosphere" especially on technical topics.
As others have said, it's a default WordPress theme, so it's often used by people who install WordPress in its default configuration because they want a reasonable web GUI to type blog content into, but don't care about making it pretty or special looking (as you would if you were using WordPress to build a company webpage).
I remember reading something about a debate between --options and -=options. It's pretty tough to search for this but maybe someone else knows where I might have come across it?
GNU stuff is so influential. It’s really remarkable. Copyleft, the open compiler, open desktop. The Linux desktop changed my life when I encountered it as a 12 yr old.
It’s hard to imagine that the philosophy of these early heroes has so pervaded the world. Free software is everywhere. Other fields don’t do this to the degree software does. So much value for humanity just because the first few took one approach when another would have worked just as well.
I always thought it had to do with chaining flags. For example, if multi-character single-dash flags were supported, and you wanted to write a `-lah` option for `ls`, you wouldn't be able to, or else you'd introduce ambiguity. And nobody wants to type `ls -l -a -h`.
Is this not part of POSIX? I see folks churning about Go and the Bell Labs people, but these styles were exactly what (in my mind) POSIX was partially in response to.
OpenBSD, just as one example of a true Unix-derived system, added its own getopt_long only in the 3.3 release in 2003, ~15 years after the first POSIX standard in 1988. This article mentions getopt_long originating ~1990, after the first POSIX standard.
That's not the kind of option the article is talking about -- the long options are generally English words, and there is a limited set of them. date uses two dashes for them -- like "--utc" and "--set". They won't work with +.
Your example is a format string. date needed to tell if the positional argument is a format or a new date, and to make it easy, they decided to prefix the format with a special character. I am going to guess that this character should not be - (to avoid confusion with options), or numeric (to avoid confusion with new date), not have special meaning in common shells (to keep quoting simpler). They could have chosen : or ^ for example.
Completely agree, the implementer of date had many options. Choosing the + character, they apparently went for conciseness instead of convention.
Not sure what you are disagreeing with though, I made no claims that someone had to use +, or positional argument in general. Just an observation about what might have guided the design process.
I'm a programmer without any special finance knowledge, and I also read it at first as being about options, the financial instrument, for some reason.