It is sad that many new command-line parsing libraries don't follow the GNU rules anymore. They more often use "-long". Then users have to figure out whether this means "--long" or "-l -o -n -g". To make the command line even more confusing, multiple tools I have used allow spaces in optional arguments (e.g. "-opt1 arg1 arg2 -opt2", where arg1 and arg2 set two values for -opt1). Every time I see this, I worry that I might be misusing these tools. I wish everyone would just follow getopt_long() and stop inventing their own weird syntax.
Edit: To be clear, I'm mostly "blaming" Go for re-popularizing this style by a) putting it in the standard library and b) being a widely used programming language; I'm not saying Go came up with this or anything.
(idk about the space separated args tho that's even worse)
Cobra (https://github.com/spf13/cobra), which is a pretty popular library for Go CLI applications, behaves more like classical GNU tools. It also offers usage/help autogeneration and autocompletion for popular shells.
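For anyone who hasn't used it, a minimal sketch of what that looks like (the `greet` command and its `--verbose` flag are invented for illustration):

    package main

    import (
        "fmt"

        "github.com/spf13/cobra"
    )

    func main() {
        var verbose bool

        root := &cobra.Command{
            Use:   "greet [name]",
            Short: "Say hello",
            Args:  cobra.MaximumNArgs(1),
            Run: func(cmd *cobra.Command, args []string) {
                name := "world"
                if len(args) > 0 {
                    name = args[0]
                }
                if verbose {
                    fmt.Println("about to greet...")
                }
                fmt.Println("hello,", name)
            },
        }

        // BoolVarP registers a GNU-style pair: long --verbose plus short -v.
        root.Flags().BoolVarP(&verbose, "verbose", "v", false, "enable verbose output")

        root.Execute()
    }

Cobra delegates parsing to spf13/pflag, which follows POSIX/GNU conventions, so shorthand flags can also be combined getopt-style.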
Not sure about the related point on compact short-option syntax, as in `tar -xvzf archive.tgz`, though...
(edit) after a quick & sloppy test it seems to work as expected
Yeah but `tar xvzf archive.tgz` also works so I remain wary of tar. Basically every time I have tried to do something that's not tfz or cfz or xfz, it went wrong until I checked the manpage.
Right. I probably picked one of the most flaky examples, sorry for that. Let's say `ls -lah`, which (I hope...) is less ambiguous.
In my defense, the specific example I gave is valid for both GNU and BSD versions of tar. If I understood correctly, the issue you point to (order among short form flags) is related to the fact that `f` expects an argument and consequently has to appear in the last position.
Ah, it's not a direct argument when you omit the hyphen and fall into "traditional" mode. I think after years and years I can finally wrap my head around how that works. :D
You don't "have to" use the examples, you can read them as get a feel, and read the captions to find the one that does what you want...
Which is faster and probably safer than scanning the documentation for individual flags and hopping you got the nuances right...
See, the two cases aren't:
(1) Thoroughly study the man page -> (2) become an expert on the command's options -> (3) try the command, secure in your mastery of it
vs
(1) Check tldr examples -> (2) try the command
They're rather:
(1) Open the man page, (2) scan and skim it and the dozens of irrelevant flags, caveats, and obscure options until you find some flags that look like they do what you want, (3) half-read them, (4) try the command
vs
(1) Check tldr examples, (2) find an example that does what you want (which is usually one of the covered use cases), (3) try the command using the example syntax
I'm generally satisfied by ZSH's inline options summary, but I'm happy to see a sane instantiation of this; it clearly fits a need. Thanks for the pointer (and sorry for the troll :/).
Google's command line flags library, known to the public as absl::Flags and formerly gflags, does not distinguish between --foo and -foo, these are both the flag "foo". Each flag has a unique name so there is never a short -f equivalent to --foo, and -foo can never mean -f -o -o.
The main design motivation of absl::Flags is that the flag definitions can appear in any module, not just main. Go inherits this. A quirk that Go did not inherit is gflags' --nofoo alternate form of --foo=false.
This is all documented at https://gflags.github.io/gflags/#commandline, which is pretty much a verbatim export of the flags package documentation that a Google engineer would see internally.
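To make the Go side of that concrete, a minimal sketch (the `name` flag is invented for illustration):

    package main

    import (
        "flag"
        "fmt"
    )

    // The flag's name is its only identity: -name and --name both mean this
    // flag, and there is no -n abbreviation or short-flag bundling. Because
    // the flag set is process-global, this declaration could just as well
    // live in any imported package, not only in main.
    var name = flag.String("name", "world", "who to greet")

    func main() {
        flag.Parse()
        fmt.Println("hello,", *name)
    }

`prog -name=gopher`, `prog --name=gopher` and `prog -name gopher` all set the same flag, while `prog -n gopher` is an unknown-flag error rather than an abbreviation.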
> The main design motivation of absl::Flags is that the flag definitions can appear in any module, not just main.
Well that's kind of horrifying. That means that command-line arguments are a form of global state, and can silently alter the behavior of the program without the calling scope noticing.
I'm kind of wary of these mechanisms, because I've been bitten by them before. There was a Python library I used that read its configuration from sys.argv the first time an object from the library was constructed. I had a rather painful time debugging to find that my script accepting a -b argument resulted in the library switching to batch mode and suppressing all graphics. Dang it, those were my arguments, and the library had no right to go behind my back and look at arguments that hadn't been directly provided to it!
If you think that's horrifying, what if I told you that a sufficiently-entitled operator of a given program can alter the flags at runtime ... using their web browser. https://twitter.com/jbeda/status/888635505201471490
Oh my. I have a gut feeling that I don't like it one bit, though I tend to be a bit more generous about logging. Logging is one of the only cases where its presence or absence doesn't change the inputs or outputs of any function, nor any other observable effect of the program. Having or removing logs doesn't impact the testability of a function, unlike any other use of global configuration.
You seem like a pretty reasonable person so prepare to be more shocked :-) In a glog stream like this, the things on the right side are not evaluated unless verbosity is on.
I have on occasion been called a reasonable person, and good heavens! I could understand that in a functional language with lazy evaluation, but that doesn't fit at all with my mental model of how C++ works. It can't be a macro, because the VLOG parentheses would need to enclose the entire expression. It can't just be the normal operator<<, because then the expression would always be evaluated. I suppose expression_with_side_effects() could return an object that is implicitly convertible to string, and the actual side effects happen in that optional conversion, but that would require lots of cooperation from the user.
I'm almost scared to ask. How is that even implemented?
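If I remember the C++ side correctly, there is no lazy-evaluation magic: VLOG(n) is a preprocessor macro that expands to a conditional, roughly `!(VLOG_IS_ON(n)) ? (void)0 : voidify() & LOG(INFO)`, so the stream expression to the right of << is simply never evaluated when verbosity is off. Go has no macros, which is why the Go port (github.com/golang/glog) asks you to write the guard yourself when the arguments are expensive. A sketch, with `expensive()` as a hypothetical stand-in for a costly call:

    package main

    import (
        "flag"

        "github.com/golang/glog"
    )

    // expensive stands in for a call with real cost or side effects.
    func expensive() string { return "costly debug dump" }

    func main() {
        flag.Parse() // glog registers its flags (-v, -logtostderr, ...) on the standard flag set

        // Arguments here are evaluated eagerly, even when -v is 0:
        glog.V(2).Info(expensive())

        // The guarded form is the closest Go gets to the lazy C++ macro;
        // expensive() only runs when verbosity is at least 2:
        if glog.V(2) {
            glog.Info(expensive())
        }
        glog.Flush()
    }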
Go was designed by former Bell Labs people who worked on Unix, Plan 9, or both. Many things about Go that people attribute to "googlism" are really attributable to work done at Bell Labs.
In my experience at Google, only Go does flags like this. Everything else (Python, Java, C++, blaze) uses the same flag syntax: long options with two dashes.
The Java ecosystem has historically used single-dash options, both in the SDK tooling (e.g. `java -jar`, `javac -classpath`) and in classic common libraries like Jakarta Commons CLI. It has moved away from this in recent years, so now you get a mishmash of single and double dashes depending on how old the option is. In some cases you end up with stuff like `java -showversion`, which prints the version to stderr, but `java --show-version`, which prints to stdout.
I have seen a mix. For example, many Android developer tools (not written in Go) use this single-dash style. I believe the standard libraries used for parsing in internal tools mostly support both syntaxes, although some docs do describe the old single-dash style by default.
TBH I have no idea; I've heard of Fuchsia, but know nothing about it. It seems pretty far removed from the majority of work I've done in Google3 (the monorepo).
>many things about Go that people attribute to "googlism" is really attributable to work done at Bell Labs.
We're 50 to 30+ years away from that Bell Labs work. They could have checked what happened in the meantime with the rest of the computing world, before re-imposing obsolete ways with the full power of Google behind them...
It predates Go significantly. C and C++ bioinformatics tools have used single-dash long opts since the 1990s, unfortunately. I expect the transgression didn't originate in the bioinformatics community.
Single-dash long options did not originate in bioinformatics, but they are more often used in this field than elsewhere. Perhaps that is partly because some of the most popular tools (e.g. blast, muscle, bedtools and gatk) followed this unfortunate convention.
One thing Go's flag package does that deserves a lot of blame is automatically sorting the flags alphabetically in the -help output. And the fact that you need to hack your way around it, instead of there simply being an option like nosort=true or whatever, is even worse. The whole idea is crazy, and basically equivalent to the statement that the order of parameters in -help serves no useful purpose.
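For reference, the workaround looks roughly like this: since PrintDefaults always visits flags in lexicographical order, you end up replacing flag.Usage wholesale and printing the flags yourself, in whatever order you meant them to appear (flag names here are invented):

    package main

    import (
        "flag"
        "fmt"
        "os"
    )

    var (
        input   = flag.String("input", "", "path to read (important, listed first)")
        output  = flag.String("output", "-", "path to write")
        verbose = flag.Bool("verbose", false, "obscure knob, listed last")
    )

    func main() {
        // flag.VisitAll (and thus PrintDefaults) walks flags in sorted order,
        // so controlling the -help order means writing Usage by hand.
        order := []string{"input", "output", "verbose"}
        flag.Usage = func() {
            fmt.Fprintf(os.Stderr, "usage: %s [flags]\n", os.Args[0])
            for _, name := range order {
                f := flag.Lookup(name)
                fmt.Fprintf(os.Stderr, "  -%s\n\t%s (default %q)\n", f.Name, f.Usage, f.DefValue)
            }
        }
        flag.Parse()
    }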
And yet, I expect flags to be sorted in a man page; I rarely read things in a logical order, I'm just looking up which flag does what.
It's a convention-over-configuration thing, I think. I mean, they set a standard, so you can move on. The alternative is to sit and think and discuss what order to put your documentation in.
You read text from top to bottom. Chances are that you're writing help text and describing the most commonly used flags at the top, and the more obscure ones lower down.
So you read the whole man page when you need a flag that does something specific? Or do you mean you never write new things, and just have to look up flags already in use by some script?
Because for everything else that seems like a fascinating waste of time.
Because you don't always know which words the man page uses to describe specific functionality. So many ways to express similar ideas, language is fun that way.
Thankfully we have git.sr.ht/~sircmpwn/getopt, github.com/pborman/getopt, github.com/mattn/go-getopt, rsc.io/getopt, and a hundred more, but I really wish getopt were part of the standard library.
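For example, with github.com/pborman/getopt/v2 (going from memory on the exact signatures, so treat this as an assumption rather than gospel):

    package main

    import (
        "fmt"

        "github.com/pborman/getopt/v2"
    )

    func main() {
        // Proper GNU-style pairs: --verbose has the distinct short form -v,
        // so -verbose is a parse error rather than a second spelling.
        verbose := getopt.BoolLong("verbose", 'v', "enable verbose output")
        name := getopt.StringLong("name", 'n', "world", "who to greet")

        getopt.Parse()

        if *verbose {
            fmt.Println("being verbose")
        }
        fmt.Println("hello,", *name)
    }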
I think Go's package "flag" was partially inspired by the one made by Apache for Java, but I can't find any sources confirming that now, so I might have seen that in a dream, heh.
However, even those seem to raise some questions. For example, this:
> To make command line even more confusing, multiple tools I have used allow spaces in optional arguments (e.g. "-opt1 arg1 arg2 -opt2", where arg1 and arg2 set two values for -opt1).
is described as something that's permissible:
> An option and its argument may or may not appear as separate tokens. (In other words, the whitespace separating them is optional.) Thus, ‘-o foo’ and ‘-ofoo’ are equivalent.
Therefore the following would be considered equivalent:
`-opt1 arg1 arg2 -opt2`
`-opt1arg1arg2 -opt2`
Were your expectations different?
Are there any good articles on the benefits of following such rules (any tangible improvements to legibility or usability, as opposed to just "consistency amongst different tools")?
Are there any tools which can validate whether a piece of software conforms to this standard (either by scanning the man pages, or the code, or a formalized description of the parameters the app supports)? Personally, the closest I've found is Typer (https://typer.tiangolo.com/), but without anything that can automatically reject non-conformant code as part of a CI process, I think enforcing such formats would be a non-starter for me.
The point isn’t to use getopt with all its complexity, the point is that two dashes for long options is already extremely well established and Go popularizing long options with a single dash is very much a regression. It creates a lot of unnecessary confusion when they really should have known better and just stuck with the conventions.
In your regex at least, removing the confusion is as simple as adding another '-', and now you're in compliance with the expectations of almost every IT person in the world who uses Unix command lines.
But that is a bad convention, it prioritizes typing speed over readability, and it only works in certain cases (for flag-only parameters). I'd say good riddance to it.
That doesn't justify long options starting with a single dash, as one could have made every option start with two dashes. Sure, `--` is longer than `-`, but typing speed shouldn't matter right?
Sure, but the only reason to add an extra dash is to differentiate --long from -l -o -n -g. No reason to add extra characters if you don't need this differentiation. Not to mention, Go's command-line parsing actually accepts both -long and --long, if you find the -- version more aesthetically pleasing.
If we only ever stuck to "the accepted convention" there would be no going forward. I see this change (of being explicit) as a win.
The situation was already confusing before, with different tools using different conventions. This way of being explicit also lets you be consistent across OSes. The world is not only GNU, fortunately.
GCC, or anyone else's C compiler, doesn't really count as far as this convention goes. GCC's flag parsing aims to be compatible with conventions that preceded GNU; other vendors' compilers break their own conventions to be compatible with GCC or whatever else cc(1) is, etc.
GNU's conventions are generally complementary to, but not incompatible with, POSIX. And POSIX specifies the behavior of the sort of flags cc(1) should understand[1].
There are many POSIX and other traditional *nix tools that are a convention unto themselves for historical reasons. E.g. notice how GNU "dd" doesn't follow normal GNU command-line conventions either.
-long = -l -o -n -g was probably a mistake. But then again, tar xvzf should have been called untar, so it’s not like there are a shortage of opinions and historical mistakes.
My question was about how "tar xvzf" should be called "untar".
Your reply might still make sense (i.e. untar could automagically figure it out), but I was highlighting how tar/untar today also means (de)compressing that tar archive using many different compression formats.
I'd like a word with the person who thought that regular ( ) parentheses for grouping were a good idea in the find syntax, requiring them to be backslash-escaped in shell scripts.
The obviously right choice would have been [ ]. You know, like in
The Amiga had a pretty cool feature where CLI argument parsing and help were provided via a library. This made things nicely consistent across almost all of the CLI tools.
Suddenly I'm reminded of how Windows represents the command line as a single string (PWSTR), and how entry points that expect argv-style are parsed by the CRT at startup.
vs. Unix where char *argv[] is what makes it to the syscall layer.
The result there is that command line processing is more consistent program-to-program on Unix. On Windows, every program could decide to tokenize the arguments differently.
I feel like there are a few interesting Microsoft phenomena that contrast with Unix thinking in both of these examples.
CommandLineToArgvW - You called that "WINAPI", but it's worth mentioning the more specific provenance of shell32.dll. This is not a core, foundational part of Windows that is used in core, foundational things. It's a helper function from the shell (Explorer, not shell in the Unix sense). So, while it has a look and function that seem pretty foundational, it really isn't. It's there because somebody working on Explorer long ago found it useful to have and decided to export their helper function from the DLL.
CRT - A CRT binary ships with Windows, but really, that code is maintained and distributed by the compiler guys and DevDiv. So theoretically, the argv parser could change at those people's whim alongside a new Visual Studio release. And it seems from squinting at that github issue like that might have happened here.
So really ... there are more artifacts here attesting to the fact that the command line arg parser is not part of the operating system. People find that functionality useful, so they look for things that "look like" the operating system official method, and maybe they find stuff that does "look like it" -- but such a thing isn't really there.
I was not arguing that it was or was not part of the OS, just showing that deferring the parsing to application code has produced two subtly incompatible implementations that differ for no reason other than that they do.
Yeah, I am not considering anything you say to be argumentative, I am just going off on tangents with this topic because I have some experience there and find it interesting.
That's a good thing. You have to be careful using a command-line SQL query when typing "SELECT *". If the processing is left to the program, an SQL app on Windows knows you didn't mean the "*" to mean all the files in the current folder.
There were also the Amiga style guides that were published with 2.0 that detailed how developers should build application user interfaces. The fragmentation in Linux/Unix distributions means that this kind of consistency is pretty much impossible, although FreeBSD does a much better job of being consistent than $majorlinuxdistros.
I love the "long options start with two dashes" convention. It means that you can chose short options that are easily combined (in cases where the command and its options are often used), or you can use long options that are much easier to understand (because they are full words). More command line tools should support them.
I typically use long options in shell scripts that will be checked in or shared with others. The self documenting nature of long opts is much nicer (imho) than the terseness of short ones.
I'm also glad short opts are available for my personal day-to-day work. I spend most of my time in a terminal and appreciate having short-hand available.
It always confused me when tools don't follow that rule. E.g. "find", where "find . --name '*.dat'" won't work but "find . -name '*.dat'" will, and it's not the only one.
`find` is weird anyway. The stuff after the paths isn't really flags; it's a tiny filter language, with significant ordering and operator precedence and all that. Using "normal" option syntax wouldn't make a lot of sense for it either.
find . -name '*.py' -a -executable -o -printf '%P\n'
can be read
(F.name matches '*.py') && (F is executable) || print('%P\n', F)
where F is the current node in the file system traversal.
They both respect -o as OR, ! as NOT, and ( ) for precedence, which you have to quote as \( and \).
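To make the "tiny language" point concrete, here is that earlier expression transcribed into Go's filepath.WalkDir, under the reading given above (a rough sketch, not a faithful find implementation):

    package main

    import (
        "fmt"
        "io/fs"
        "path/filepath"
    )

    // Rough Go rendering of: find . -name '*.py' -a -executable -o -printf '%P\n'
    // The -printf branch fires exactly when the left side is false, so this
    // prints every entry that is NOT an executable *.py file.
    func main() {
        filepath.WalkDir(".", func(path string, d fs.DirEntry, err error) error {
            if err != nil {
                return err
            }
            info, err := d.Info()
            if err != nil {
                return err
            }
            isPy, _ := filepath.Match("*.py", d.Name())
            isExec := info.Mode().Perm()&0o111 != 0
            if !(isPy && isExec) {
                rel, _ := filepath.Rel(".", path)
                fmt.Println(rel) // find's %P: the path relative to the start point
            }
            return nil
        })
    }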
A couple years ago, someone helped me implement a better "find" without this wonky syntax for https://www.oilshell.org/ . But it isn't done and needs some love. If anyone wants to help, feel free to join Zulip :)
I do think that "find" is more like a language than a command line tool. It's pretty powerful, e.g. I just used it to sort through 20 years of haphazard personal backups.
Thanks for the article! How did I not see this before?
Didn't know that POSIX has obsoleted -a and -o either.
I guess I have some shell scripts to rewrite, heh.
Well the thing find and test have in common is that they lack a lexer! They abuse the argv array for tokens instead. I might call it the "I'm too lazy to write a lexer" pattern :)
jq has a lexer and hence a "real" syntax, but so does awk, which is maybe 30 years older. But yes, jq is a surprisingly big and rich language, maybe bigger than awk.
Find is about as user-unfriendly as a shell command could be. I never get it to do what I want on the first try. And its error messages are always cryptic and unhelpful.
I don't think any shell commands are particularly "friendly." Most are intentionally terse (in fact I find verbose, "friendly" command options to be annoying), and you learn them by repeated use, or for those that you use only occasionally, by consulting the man pages.
What does "illegal option" mean exactly? Why is it "n" which is the first letter of "-name"? Yes, it wants a path. Yes, even if you want to search in the current directory. Yes, it IS unusual, because all other commands that operate on directories, like `ls`, assume current directory if you don't specify any.
Why could it not just say "a path is required" instead?
It's saying that because it's using getopt to parse any initial option arguments. That diagnostic message is the standard default message printed by the getopt function whenever encountering an invalid option flag. It means all utilities using getopt will, unless you disable the default behavior, display the same initial diagnostic. It's idiomatic for utilities to then print a short usage message of its own.
Judging by the usage message you printed, you were almost certainly using a BSD implementation, probably on macOS, which in turn is probably sync'd from FreeBSD. `find -name something` will fail early in main. See https://github.com/freebsd/freebsd-src/blob/b422540/usr.bin/... When processing the 'n' in '-name' getopt() will return '?', which will end up calling usage().
The GNU implementation of find is completely different, though I'm not sure it does what you expect:
$ find -name something
That prints nothing and returns a successful exit code. But if you remove the "something" operand you get what I presume you were originally expecting as an error message:
Not rocket science, but as a programmer and maintainer which approach do you think makes more sense? Is trying to do the supposedly intuitive thing worth it, especially considering find's already arcane and irregular syntax? As an experienced command-line user I'd just be thankful that the option flags (as opposed to the filter directives) are parsed regularly.
This is a good explanation of why it has the current behaviour, but it doesn't answer the question of why the behaviour isn't better (i.e. telling the user what's needed, the path, instead of telling the user that what was provided is not what's needed, which is vague and leaves it up to the user to figure it out).
It's not like the source code is now etched into stone and can't be changed. Or is it?
GNU find, or at least my version of GNU find (4.8.0), will just assume "." if the path is missing, and will work as expected. I think various forms of BSD find are a bit stricter, and based on that usage message it seems to be BSD find.
It gave you the list of options (I think that's at most one of -H and friends, as many as you like of -E and friends, and -f with an argument), and -n isn't one of them.
Several BSD commands are pickier than GNU commands about option order, sometimes for good reason, sometimes because it was easier to write that way.
This is why I've ultimately come to the conclusion that shells are for casual use only, not for any kind of serious work. There are too many implementation details, inconsistencies, and footguns to write anything that needs to be somewhat reliable.
To be fair, there is one shell that I think someday we could rely on. https://www.nushell.sh/ Besides that, my answer is "any programming language," since at the core, dealing properly with system calls and their outputs is the whole reason PL's exist. In practice, I've been using Rust lately which makes a nice systems language, but JS and Python are always options for shell-like scripts that don't suffer from quite the level of degeneracy when encountering weird filenames or unexpected input in general.
That would be a terrible shell. Changing directories, listing them, moving files, running programs are all simple no-brainer operations in any reasonable shell, but are non-trivial in any programming language that's not designed to be a shell.
So you use the shell for things that require no brain: browsing your directory tree, casual printing of files. Then, when you need to encode these operations in a script, you pull out a scripting language, because you need more than the shell can provide with its casual nature.
Legacy and backwards-compatibility. find(1) is a really funny example, too, because POSIX find doesn't have that many flags, so they could probably fit all of them into the short format.
I've seen this a few times, but the one that always gets me is things like aws: it does something in response to "aws --help", but it doesn't tell you that you really want to call "aws help" to get some useful help.
I've seen that pattern before, but it always drives me a little crazy.
I'd rather have the suggestions, I don't take hints from computer software personally :) Sometimes I just misremember a particular command (i.e. "submodule" vs "submodules").
It’s unfortunate that Go’s standard package `flag` doesn’t follow the standard either, given the language is otherwise a good fit for command-line tools.
I ran into a related issue a couple of years back where people were using single-dash flags for a C++ project that was using Abseil flags in conjunction with getopt parsing of short flags (for legacy reasons). Why were they using single-dash flags, despite that not showing up anywhere in our documentation? They copy-pasted from --help.
(I'm happy to say that --help in Abseil has since been fixed.)
But that doesn’t preclude mistakes by collision (N short flags match a long one) or unpredictable bugs in a long flag interpreter (a short flag being a substring of a long one)—both being trivially common bugs when this ambiguity is allowed, especially when an API is ported to another environment with less tooling standardization around interpreting the input.
Go doesn't allow for specifying multiple short flags all run together, or for flag args without spaces, so neither of those are directly relevant here.
Also, that first issue happens with POSIX flags (with the GNU long flag extension, anyhow): `grep -help` is different from `grep --help` (and if you type the former, it'll just wait patiently for you to close stdin).
Which is also why Windows uses backslash (\) as their path separator. Because forward slash would have collided with the slash option marker Windows inherited from VMS.
That is surprisingly false. Microsoft operating systems use both / and \ as a path separator, going all the way back to DOS.
Early versions of MS-DOS made it a user preference option in the command.com interpreter, whether the user wanted to use / for options and \ for path separation or vice versa.
In longer words: Windows was originally a GUI system on top of DOS which was influenced by CP/M. The NT kernel did away with DOS, but the influence still lives to this day. For a simple one: not being able to name a file "con" (or any capitalized variation) comes all the way from CP/M.
For the uninitiated: OSes from that era didn't have "directories"; everything lived in the root of the drive, including device files. So, to print a file, you could literally do something like:
A> type FILE.TXT > PRN
When DOS added directories, they retained this "feature" so programs unaware of what directories were could still print by writing to the `PRN` "file". Because of "backwards compatibility", NT still has this "feature" as well.
One thing VMS got right is that each binary declared its supported options and the shell could tell you what they were. And it would take any unique abbreviation.
PowerShell scripts and cmdlets work similarly. They probably won't have help text, but at least you can see what's available without having to look at the argument-parsing section of the script. And you can use the shortest unique prefix as the short form of an argument (though I don't love this, since adding an argument can break the shortened form of other arguments).
It'd make the typing simpler. PowerShell has POSIX-like aliases, like 'rm' and 'cd', but they don't accept POSIX parameters. So you end up with "rm -Recurse", since rm is an alias for Remove-Item.
I like PS in theory but the syntax and naming just absolutely kill me. What were they smoking when they named as simple an operation as delete "Remove-Item"? And what's with all of the capital letters?
That's what happens I guess when the people designing it haven't actually used a CLI day to day much, because, well, they're using Windows.
I can't agree. I have used Linux shells for some time (since 97), and while the olden days would be me laughing at vbs and all that awfulness, I'd take PowerShell any day.
The short terse commands and the really awkward, confusing, mistake-prone syntax of sh or bash really rears its ugly head in scripts.
Interactive shell? No problem. But that's the beauty of PowerShell: verbosity and correctness in scripts, where the IDE quickly expands those long commands, and short aliases for interactive use.
> The short terse commands and the really awkward, confusing, mistake prone syntax
When used in an interactive shell short commands save time and effort. And it is easy to learn and remember them because in everyday work you need only about 10 commands. For some some commands which I use a lot I have one-two letter aliases to type even less e. g. i=fgrep.
It makes shell scripts less readable for someone who come from windows and and don't know even common shell commands, but for someone who use shell at least from time to time it should be easy to read.
Yeah I agree with that. Bash (and friends) scripts are awful. PS scripts are nice and readable, and not subject to the insane quirks of bash ([ vs [[ vs test? come on)
Seems like the real solution is separating scripts from interactive use.
Ironically, it already happened: bash for the user interface, while /bin/sh is something else. But bash keeps being a REPL that was accidentally promoted to a user interface.
> What were they smoking when they named as simple an operation as delete "Remove-ChildItem"?
Simple. All these commands work with providers, of which the file system is just one. Other providers include the Windows Registry, environment variables, certificate stores, and functions and variables in the PowerShell runtime. More providers can also be created and plugged into the system. PowerShell providers are essentially Windows' FUSE. See [0] for details.
So, for instance, you can do `Get-ChildItem HKCU:` to list entries under HKEY_CURRENT_USER in the Registry, the same way `Get-ChildItem C:/` will list you top-level items on the C: drive. Worth observing: while the console output for these two commands is similar, the results are in fact different objects underneath (Microsoft.Win32.RegistryKey vs. System.IO.FileInfo).
In short, these commands are an abstraction over file-system-like things. Whether or not that was a good idea is a different question.
It makes a little more sense in context to me. The verbose Verb-Noun style works because the verbs are designed to be a limited set. E.g. there's Remove- but no Delete- in the standard (shown in `Get-Verb`). So you can press ctrl+space after typing Remove- and see all the different types of things you can remove. Too many, so you can filter to Remove-<prefix>* etc. The verbosity of cmdlet names when using it as a shell is mitigated by the aliases (e.g. rm), and for parameters by accepting any case and shortening to anything non-ambiguous (e.g. `rm -rec -fo`).
I guess the capitalisation comes from C#/.NET's casing? I like PascalCase for its great readability/conciseness tradeoff over others, and it's standard Windows case-insensitive, so I've never had a huge issue with it.
The tradeoff is that "all the things I can remove" is usually "the set of all things my shell knows about" and not "the set of things related to my task at the moment" -- ChildItem-* would be more helpful!
Neat thing you can do is type "*-Noun" and the tab completion will give you options that fill in the "*". Alternatively "Get-Command *-Noun" will also list out all of the matching commands. Get-Help also supports that kind of wildcard so you get the list of commands along with their help summary.
The "*" can even be in the middle. I open VS solution files all the time from Powershell. Since there are often many other files and folders with similar names alongside them I just type ".\*.sln" and hit tab.
I disagree and agree with the sentiment. As someone more familiar with Linux, I sure would prefer to be able to assume a similar style.
But the biggest thing I'm happy about WRT Powershell is that it's consistent (and pretty well documented). At least it makes sense. Batch scripting really didn't.
Except they did, and I for one wish traditional Unix shells would die. Composing software by having every single program and script include a half-assed parser and serializer is causing a lot of unnecessary waste and occasional security problems in computing. Moving structured data in pipes is just a better idea.
Wish I could (actually, I'd prefer JSONB or some other binary format). Unfortunately, every program in the UNIX ecosystem assumes unstructured text in pipes, and makes it my responsibility to glue them together by building ad-hoc parsers with grep, head, sort, sed and awk.
A lot of more recent programs (such as the AWS and K8s tools) can easily output JSON. You can make schemas match, but most of the time you'll need something like jq to transform what one program outputs into what makes sense for the other.
I always try to design my tools with a "terse" output that makes it easier to pipe it into other programs.
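A sketch of the JSON-lines version of that idea in Go, one object per line so downstream consumers can use a real parser instead of grep/sed (the record type and fields are invented for illustration):

    package main

    import (
        "encoding/json"
        "os"
    )

    // Result is a hypothetical record type; the fields are just for illustration.
    type Result struct {
        Name string `json:"name"`
        Size int64  `json:"size"`
    }

    func main() {
        enc := json.NewEncoder(os.Stdout) // Encode emits one JSON value per line
        for _, r := range []Result{{"foo.txt", 42}, {"bar.txt", 7}} {
            enc.Encode(&r)
        }
    }

Anything that reads line-delimited JSON, jq included, can consume that output directly.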
Fwiw I'm pretty sure that POSIX_ME_HARDER wasn't an RMS-ism. RMS invented the "-pedantic" gcc flag (to enable some warning messages that he felt weren't necessary) and that always got a laugh when he talked about it and that was more his style. POSIX_ME_HARDER was more of a signature style of one of the other devs at the time, rather than of RMS.
I remember seeing this buried in some ifdefs somewhere, or in compile output, back in the day.. to me, finding the origins of this is almost as interesting as the longopt story itself, or more so :)
Oh come on. I had a laugh reading it, but immediately understood why it was changed—voluntarily, not by anything resembling police. I wouldn’t wish real censorship on anyone, but I wish y’all were at least able to identify it with better accuracy than a poorly trained AI.
I think what's recent is the syntax for mixing short and long options together on the same command. Some commands had long options, some had short, but with "--" one command can have both.
Pretty sure some programs used them before 1990, just not with a convenient getopt_long(). I know it's not the best example, but 'dd' used things like "if=whatever skip=123" prior to 1990. The article also mentions find, but it used single dash long options.
Those aren't really options. The syntax of the find command is
find <options> <paths> <expression>
Those thing you list are part of the <expression> part of the command. The <options> part in BSD find, and I believe GNU find, only uses options of the form -X where X is a single character.
It's a little confusing because the man pages for both BSD and GNU find do call some of the things that appear in the <expression> part of the command "options".
> There were a few programs that ran on Unix systems and used long option names starting with either - or no prefix at all, such as find, but those syntaxes were not compatible with Unix getopt() and were parsed by ad-hoc code.
The Free Software Foundation held a public election on how to do long options three decades or so ago.
This was likely before the effects of Eternal September began destroying the public Usenet, so the vote may well have been held there, in one of the newsgroups relevant to the FSF, GCC or GNU.
The '--' alternative won overwhelmingly, as I remember it. A few hundred votes were cast by email.
Weirdly, this kind of syntactic idiosyncrasy is something that got me interested in Erlang. Finally a language that uses full stops when a routine full stops. I find most of the rest of its syntax uncomfortable (I didn’t spend much time with the language, I’m sure it’s fine when you’re used to it), but I always found it weird to end a completed statement with a statement-list-joining punctuation mark.
I thought of mentioning the Prolog heritage. Weirdly CSS (having the worst syntax consistency of any language I can think of) is hyphen-heavy and solves its negation infix operator ambiguity well: it needs to be surrounded by whitespace.
For Prolog/Erlang, I think the preceding syntax is disambiguating enough
I always thought this was a Wirthism, because Pascal ends unit and program with an "end." (with a dot), whereas function and procedure are terminated with "end;" (with a semicolon). I don't know about other Wirth languages though; maybe it is Pascal-specific and not really something typical for Wirth?
Erlang is great, and I got used to the punctuation, but it's kind of a pain when you're moving code around.
Oh, now this is the last thing, gotta take off the ; or replace a , with a .
At least when I was starting out, I'd have loved a more C-like syntax with {} and consistently semicolons. Of course, Elixir came and just got rid of most punctuation, which I like less.
Anyway, it's consistent and after a couple weeks of messing it up, I can consistently see where the mistake is from the compiler error; after several years, I still sometimes mess it up, but oh well. I can't recall having messed up the punctuation so much that it still compiled but wasn't what I meant, so it's almost always quick to recover.
I mean, if you want to undo subtracting something, you add it? The only reason I ever found it confusing is that it was "backwards", but that feels like a forced error due to + requiring a shift while - doesn't.
You could ask Richard Stallman himself, who is well known for responding to random inquiries from people.
Presumably it was rejected because the whole point of POSIX was to consolidate, regularize, and simplify pre-existing practice. Adding "+" as an additional standard option signifier would be a huge step in the complete opposite direction. The only precedent for "+" would have been the `set` shell builtin, and AFAIU the committee only begrudgingly grandfathered that syntax.
Someone elsethread mentioned the `date` utility, but if you look at the BSD implementation "+" isn't used as an option marker, per se, but rather to disambiguate operand strings. The 2001 standard only defined the following:
date [-u] [+format]
date [-u] mmddhhmm[[cc]yy]
It's splitting hairs, but POSIX was at least able to shoehorn the legacy syntax into a more regularized base interface.
In many astronomy programs, we have to put up with the PFILES convention, where arguments are given like so
mytool infile=foo.fits outfile=bar.fits
Some parameters need to be given (and will be prompted for if not given), and some get a default if they are omitted. The extra tricky part is that the parameters and their defaults are read from .par files in a path. There are the default ones for the tool, and a user-specific parameter file which can be modified using the "pset" program (or read with "pget"). A tool when run will also modify the par file to update various parameters with the ones given on the command line.
Unfortunately, there is no form of locking on these par files, so one has to mess around with the path settings (to make per-process paths), or use some form of locking, to ensure they don't get corrupted if multiple processes are run at once.
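For readers who haven't run into it, the key=value surface syntax at least is easy to sketch, here in Go, with invented defaults and none of the .par-file or prompting machinery:

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    // Minimal parser for PFILES-style "key=value" arguments, as in:
    //   mytool infile=foo.fits outfile=bar.fits
    func main() {
        params := map[string]string{"infile": "", "outfile": "out.fits"}
        for _, arg := range os.Args[1:] {
            k, v, ok := strings.Cut(arg, "=")
            if !ok {
                fmt.Fprintf(os.Stderr, "not a key=value argument: %q\n", arg)
                os.Exit(1)
            }
            params[k] = v
        }
        fmt.Println(params)
    }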
Kind of off-topic, but somehow I've found that blogs that use that particular template (with the blue top header and all) pretty much always have content that I find interesting or useful. Often they contain niche information that's hard to come by elsewhere. Has anybody felt similarly/know what might cause this?
It’s a rare moment on HN when a commonly maligned CMS written in a commonly maligned language gets such high praise for its default behavior. Just commenting here in hopes I can come back and reflect on the weird disconnect between UX and nerd preference.
I believe it is a WordPress default template. Perhaps it is because these authors are so focused on quality writing they have no time for frivolous nonsense like 'templates'.
It was the default Wordpress theme from ca. 2005-2010 which at least in my mind was basically the peak era of "the blogosphere" especially on technical topics.
As others have said, it's a default WordPress theme, so it's often used by people who install WordPress in its default configuration because they want a reasonable web GUI to type blog content into, but don't care about making it pretty or special looking (as you would if you were using WordPress to build a company webpage).
I remember reading something about a debate between --options and -=options. It's pretty tough to search for this but maybe someone else knows where I might have come across it?
GNU stuff is so influential. It’s really remarkable. Copyleft, the open compiler, open desktop. The Linux desktop changed my life when I encountered it as a 12 yr old.
It’s hard to imagine that the philosophy of these early heroes has so pervaded the world. Free software is everywhere. Other fields don’t do this to the degree software does. So much value for humanity just because the first few took one approach when another would have worked just as well.
I always thought it had to do with chaining flags. For example, if multi-character single-dash flags were supported, and you wanted to write a `-lah` option for `ls`, you wouldn't be able to, or else you'd introduce ambiguity. And nobody wants to type `ls -l -a -h`.
Is this not part of POSIX? I see folks churning about Go and the Bell Labs people, but these styles were exactly what (in my mind) POSIX was partially in response to.
OpenBSD, just as one example of a true Unix-derived system, added its own getopt_long only in the 3.3 release in 2003, ~15 years after the first POSIX standard in 1988. This article mentions getopt_long originating ~1990, after the first POSIX standard.
That's not the kind of option the article is talking about -- the long options are generally English words, and there is a limited set of them. date uses two dashes for them -- like "--utc" and "--set". They won't work with +.
Your example is a format string. date needed to tell if the positional argument is a format or a new date, and to make it easy, they decided to prefix the format with a special character. I am going to guess that this character should not be - (to avoid confusion with options), or numeric (to avoid confusion with new date), not have special meaning in common shells (to keep quoting simpler). They could have chosen : or ^ for example.
Completely agree, the implementer of date had many options. Choosing the + character, they apparently went for conciseness instead of convention.
Not sure what you are disagreeing with though, I made no claims that someone had to use +, or positional argument in general. Just an observation about what might have guided the design process.
I'm a programmer without any special finance knowledge, and I also read it at first as being about options, the financial instrument, for some reason.