
I don't agree with the whole 'useless use of cat' meme: cat creates an output stream, which in turn allows you to build up the subsequent command bit by bit and allows you to cut and paste chunks from useful building blocks you keep in a file.

The first command changes all the time, so by using 'cat' those blocks remain reusable: each block simply reads the piped input that cat sends when there is no preceding program, rather than needing < somefile inserted in the middle of its first command.



Totally with you there. "Useless use of cat" is usually good practice, not bad.

It's clearer - the structure indicates right at the front what it is going to do, namely read a file and pass it through a pipeline. There is no need to read ahead to find out what the source material is.

It's safer - "cat" is a read-only operation, once you've written that command up-front there is no longer a risk of overwriting the original file with a typo in the rest of the pipeline.

It's simpler to construct and nicely orthogonal to the rest of the pipeline - you can write the "cat" and then season the rest to taste (as you suggested).

I will occasionally remove cat from a very heavily-used loop, but as a default style it's fine.


> It's clearer - the structure indicates right at the front what it is going to do, namely read a file and pass it through a pipeline. There is no need to read ahead to find out what the source material is.

I only found this out recently, but this works perfectly fine:

$ < /some/file awk ...

You're not wrong though, where cat improves readability there's no harm in using it.


"It's safer - "cat" is a read-only operation, once you've written that command up-front there is no longer a risk of overwriting the original file with a typo in the rest of the pipeline."

With zsh you can prevent such accidents by "setopt NO_CLOBBER"

The result is that if you "foo > bar" and "bar" exists, zsh will refuse to overwrite "bar" and give you an error: "zsh: file exists: bar"

This makes constructs such as "foo < bar > baz" perfectly safe, because accidentally typing "foo > bar > baz" will error out when "bar" (or "baz") already exists.

(PS: if you want to force zsh to overwrite the file even when NO_CLOBBER is set you can "foo >| bar")


That doesn't help with sed -i and similar things.

Zsh stops redirection errors, it won't help even if cat is in the front.


"That doesn't help with sed -i and similar things."

Can you give an example? I don't know what you mean.

"Zsh stops redirection errors"

Which is what the post I was replying to was complaining about, wasn't it?

"it won't help even if cat is in the front"

Why not? "cat > foo" will error out if "foo" exists and NO_CLOBBER is set.


>> "That doesn't help with sed -i and similar things."

> Can you give an example? I don't know what you mean.

Sure. The first line below is dangerous no matter what zsh does to save you from yourself. The second line is safe no matter which shell you are using, and no matter what other commands are in the pipeline:

    sed $SEDOPTIONS "s/$SEARCHTERM/$REPLACEMENT/g" $FILENAME

    cat $FILENAME | sed $SEDOPTIONS "s/$SEARCHTERM/$REPLACEMENT/g"

>> "Zsh stops redirection errors"

> Which is what the post I was answering to was complaining about, wasn't it?

That's not how I interpreted "a typo in the rest of the pipeline." Sure, the typo could be a redirection. It could also accidentally set $SEDOPTIONS in the example above to include the '-i' flag.

>> "it won't help even if cat is in the front"

> Why not? "cat > foo" will error out if "foo" exists and NO_CLOBBER is set.

Yes, "cat > foo" will error out, but "cat $FILENAME | sed -i "s/a/b/g" $FILENAME" won't.


Yes, in-place editing with tools like sed is dangerous.

But, in your own example you have a useless use of cat:

    cat $FILENAME | sed $SEDOPTIONS "s/$SEARCHTERM/$REPLACEMENT/g"

could be replaced with:

    sed $SEDOPTIONS "s/$SEARCHTERM/$REPLACEMENT/g" < $FILENAME


Well, that's the whole point of the useless use of cat - in the non-cat example you gave, making a typo at the end of the command destroys my data.

With the useless use of cat, that is no longer possible even if there is a typo in the SEDOPTIONS.
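
For anyone who wants to try this safely, here is a minimal sketch on a throwaway file. It assumes GNU sed (BSD sed handles `-i` differently), and the temp directory, file contents, and `s/a/b/g` expression are made up for illustration; only `FILENAME` and `SEDOPTIONS` mirror the example above.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed. The temp file and its contents are hypothetical.
tmpdir=$(mktemp -d)
FILENAME="$tmpdir/data.txt"
printf 'aaa\n' > "$FILENAME"

SEDOPTIONS="-i"     # the "typo": an in-place flag sneaks into the options

# Piped form: sed is never told the file's name, so it has nothing to rewrite.
# GNU sed is expected to complain that -i was given no input files.
piped_output=$(cat "$FILENAME" | sed $SEDOPTIONS "s/a/b/g" 2>/dev/null)
after_piped=$(cat "$FILENAME")

# File-operand form: the very same options now edit the file in place.
sed $SEDOPTIONS "s/a/b/g" "$FILENAME"
after_operand=$(cat "$FILENAME")

rm -rf "$tmpdir"
echo "after piped form:        $after_piped"
echo "after file-operand form: $after_operand"
```

With GNU sed the piped form should leave the file as `aaa`, while the file-operand form rewrites it. As noted earlier in the thread, the piped form only protects you as long as the filename does not also appear after the sed command.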


> With zsh you can prevent such accidents by "setopt NO_CLOBBER"

Don't even need zsh for this; "set -C" will do the same in any POSIX shell. >| is also in POSIX. csh supports it as well.
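
A minimal sketch of that behavior, assuming a POSIX-style sh with `mktemp -d` available (which is common but not itself POSIX). The file name `bar` and its contents are hypothetical.

```shell
#!/bin/sh
# Sketch only: noclobber via 'set -C'. All names here are hypothetical.
tmpdir=$(mktemp -d)
target="$tmpdir/bar"
printf 'original\n' > "$target"

set -C                                   # enable noclobber

# With noclobber on, a plain '>' onto an existing regular file should fail.
# The subshell keeps the failed redirection from affecting the main script.
if (printf 'oops\n' > "$target") 2>/dev/null; then
    plain_redirect="overwrote"
else
    plain_redirect="refused"
fi
after_plain=$(cat "$target")

# '>|' explicitly overrides noclobber for this one redirection.
printf 'forced\n' >| "$target"
after_forced=$(cat "$target")

set +C                                   # restore the default behavior
rm -rf "$tmpdir"

echo "plain '>': $plain_redirect (file still contains: $after_plain)"
echo "'>|': file now contains: $after_forced"
```

Note that this only guards against redirections; it does nothing for commands that open and rewrite files themselves.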


It's possible to avoid `cat` and still keep the filename separate from the rest of the command by placing the redirection at the start of the command:

    < /tmp/com.txt awk ...
It might look a little odd, but it's portable.
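
A quick way to convince yourself that the three spellings feed the command the same input is a sketch like the one below. The file `words.txt`, its contents, and the awk program are hypothetical stand-ins for the elided `awk ...`.

```shell
#!/bin/sh
# Sketch only: the input file and the awk program are hypothetical.
tmpdir=$(mktemp -d)
input="$tmpdir/words.txt"
printf 'one two\nthree four\n' > "$input"

# Same command, three ways of supplying stdin.
with_cat=$(cat "$input" | awk '{ print $2 }')
redirect_last=$(awk '{ print $2 }' < "$input")
redirect_first=$(< "$input" awk '{ print $2 }')

rm -rf "$tmpdir"

if [ "$with_cat" = "$redirect_last" ] && [ "$with_cat" = "$redirect_first" ]; then
    echo "all three forms produced the same output"
else
    echo "outputs differ"
fi
```

The shell processes the redirection before running the command, so where it appears on the line is purely a matter of readability.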


I don't know about you, but I usually like seeing what's in the file before I spend time writing an awk command on it.

    cat myfile.txt                   # check the file because I'm not even sure that's the correct name
    cat myfile.txt | awk blablabla   # no I didn't forget awk syntax, I swear
I would be afraid of anyone who, after this, goes back and rewrites the beginning of the line to replace cat with a redirection.

Even with "ctrl-a alt-d < alt-f alt-f ctrl-d ctrl-d ctrl-e" that replaces cat with < and removes the pipe, which any emacs user can pull off in their sleep.


If you use zsh then just "<myfile.txt" should work too, which opens the file in $PAGER. Doesn't seem to work in bash though, with the default config anyway, but maybe it can be configured.

I don't care if people use "cat" or "<", and the whole "useless use of cat" thing is stupid >99% of the time, but I've gotten into the habit of using <file as it's shorter to type ("<file cmd" vs. "cat file | cmd").


I usually use head or less for that.

Using cat for a first look is usually asking for trouble if the file is huge or binary.


I'd agree with avoiding the < somefile redirection because it's sorta annoying to grok in one-liners. But in this case the file would just go after the awk command.


You can do

    <somefile awk ... | ...
The redirect doesn't have to appear "out of order" at the end of the command.

In zsh you can even do just `<somefile` to print it (equivalent to `cat somefile`, very useful for reading files into variables) although in bash that appears to do nothing.


In this particular case, why is

    cat /tmp/com.txt | awk '{ print substr($1, 1, length($1) -5) }' | uniq > domains.txt

preferable to:

    awk '{ print substr($1, 1, length($1) -5) }' /tmp/com.txt | uniq > domains.txt
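
As far as output goes, the two forms should be interchangeable. Here is a minimal sketch on made-up data; the contents of `com.txt` are hypothetical (the real file is not shown in the thread), chosen so that dropping the last five characters of the first field leaves a bare name.

```shell
#!/bin/sh
# Sketch only: the sample lines are invented, not the real /tmp/com.txt.
tmpdir=$(mktemp -d)
com="$tmpdir/com.txt"
printf 'example.com. NS ns1\nexample.com. NS ns2\nother.com. NS ns1\n' > "$com"

# Form 1: cat feeds awk through a pipe.
piped=$(cat "$com" | awk '{ print substr($1, 1, length($1) -5) }' | uniq)

# Form 2: awk opens the file itself.
direct=$(awk '{ print substr($1, 1, length($1) -5) }' "$com" | uniq)

rm -rf "$tmpdir"

echo "piped:"
echo "$piped"
echo "direct:"
echo "$direct"
```

One practical difference: when awk is given the file as an operand it knows the file's name (via `FILENAME`), whereas in the piped form it only sees standard input.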



