Feature comparison of ack, ag, git-grep, GNU grep and ripgrep (beyondgrep.com)
244 points by infodroid on Jan 8, 2018 | 100 comments



The chart was created by Andy Lester, the creator of ack, who thinks that more open source projects should point to their "competing" projects because it's not really a competition: http://blog.petdance.com/2018/01/02/the-best-open-source-pro...


Emphasis mine-

Excerpt: "Some might say that ag and ripgrep and any of the other tools I list on beyondgrep.com are competing projects, but I think that way of thinking is wrong. It’s only a competition if you see it as a competition. I’m not competing against anyone for anything: Clicks, dollars, popularity, etc. If someone uses ripgrep instead of ack, it doesn’t hurt me. It’s the difference between an abundance vs. scarcity view of the world. I choose abundance. I think most of us who work in open source do, too."

I never thought I would see the confluence of the "woo-woo" space abundance mindset and a blog post about an open source command-line utility. I must say I am intrigued.

If this explanatory medium doesn't substantiate the reasonability of a mindset of abundance in the minds of programmers, I don't know what ever will.


I don't read that as a name-it-claim-it spell. I read it as an accuracy claim. That the competition is a self-fulfilling prophecy. I don't know what's "woo-woo" about that.


What is woo-woo space?


I’d wager they’re talking about the whole “The Secret” abundance mentality thing. “Believe and you shall receive... somehow”. That sort of interesting New Age stuff. I put no credence in it myself in terms of their stronger claims, but hey, CBT reminds me of it in a lot of ways and helped cure my depression.


Author here. I don't know about "The Secret" other than it being an Oprah thing, what, ten years ago?

Abundance vs. scarcity has nothing to do with “Believe and you shall receive... somehow”.

Scarcity thinking means that you fear giving people credit, and letting others have success. You fear that praising others makes you seem weak. You think that if someone uses a different open source project than yours, that you or your project suffers. You might not even be aware of the thinking. You might just feel it reflexively.

Abundance thinking says that there is more than enough praise to go around. It says that your success doesn't hurt me in mine (unless in some tangible way it does). It says that you can use ag and I can use ack and Susan can use ripgrep and it's all good.

In this specific case, abundance says that I, as the creator of ack, don't need to own the "market". In fact, the creation of other tools only helps ack. It gives the ack team ideas for things we can implement in ack. Who am I to think that I'm the only one with good ideas?

It gives our users a wider choice of tools. I put my work out there publicly to help people. Why would I not want them to have a variety to choose from?

That's abundance vs. scarcity.


Well said. :-)

I've added a link to your feature comparison table to my REAMDE[1]. Thanks again for putting it together!

[1] - https://github.com/BurntSushi/ripgrep#feature-comparison


Involuntary Neal Stephenson reference :-)


Oh I wasn’t having a go at you, merely answering the question posed by the parent and grandparent commenters here :)


Your view on the world shapes how you perceive the world (yes I've learned that in the tautology club) and as your reality is only what you perceive, your view on reality shapes how you see it.

Not much woo woo involved.


I mean, if you're trying to make a living, it is a competition.


The table is misleading isn't it? It seems to imply that rg can't work recursively, but [1] states "..ripgrep defaults to recursive directory search...". I understand that the table might mean that rg doesn't have a flag to enable recursive search, but surely we care about features more than flags...

[1] https://github.com/BurntSushi/ripgrep#why-should-i-use-ripgr...


The table doesn't yet distinguish between lacking a command line flag due to absence of a feature vs due to the feature being the default. This is similar to the situation with case-sensitive search in GNU grep, which ends up with a blank cell even though it is the default. See: https://github.com/beyondgrep/website/issues/72

It's a fair criticism but I don't think it is designed to be misleading.


Maybe not designed to be misleading, but misleading nonetheless. For `grep`, `grep -i needle` is (approximately) equal to `ack/ag needle`, but looking at the table I wouldn't think grep supports case sensitivity.


I've updated the phrasing so it says "Re-enable case-sensitive search over case-insensitive or smart-case search".


It's almost more useful as a rosetta stone than a feature comparison.


Author here. Yes, that's really what it is. I need to make one that's actually feature comparison more than phrasebook.

We're also working on a GNU grep vs. POSIX grep vs. BSD grep phrasebook.


I don't think it would take very much to go from rosetta stone to a more comprehensive feature comparison.

For example, the table could show the blank cell in a different color if it is a default feature, or if it is an unsupported feature.

Alternatively, the cell could contain a message such as "(default)".


It's going to be two pages ultimately. If you're comparing features, you don't care about the how.

Right now we're keeping all the data in a JSON file that we massage into a chart. Massaging it into a true feature comparison chart, along with a rosetta stone, should be a simple matter of programming. Same thing with the GNU/POSIX/BSD rosetta stone we're working on.

https://github.com/beyondgrep/website/blob/dev/comparison.js...


I agree, but I know in my own project that my "competition" is moving, and I don't follow them. Thus anything like the linked one that I create quickly becomes out of date as they add new features.


Here to plug using `--passthru`/`--passthrough`: it will print all lines, but highlight matches. I often do things like this to watch an output log, but highlight all entries with the string PLUGH in them:

   tail -F output.log | ag --passthrough '.*PLUGH.*'


You can achieve that in almost any of these tools (including old-school grep) by matching against something with zero width, like this:

  grep -E '^|.*PLUGH.*'
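
A minimal sketch of why the zero-width trick works, assuming GNU grep with `-E` so that `|` is treated as alternation. `^` matches every line without consuming any text, so every line is printed, while only the `PLUGH` branch has anything to highlight. The `passthru` function name is made up for illustration.

```shell
# Print every line of stdin, highlighting only the parts that match $1.
# "passthru" is a hypothetical helper name, not a grep feature.
passthru() {
  # '^' matches at the start of every line (zero width), so no line is filtered out;
  # the second alternative is the only one with visible text to colorize.
  grep -E --color=auto "^|$1"
}

# Sample input, invented for this example.
printf 'start\nfound PLUGH here\nend\n' | passthru 'PLUGH'
```

When the output goes to a terminal, only `PLUGH` is colored; when it goes to a pipe, `--color=auto` drops the color codes and all lines pass through unchanged.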


You could do that.

Or you could just use --passthrough. It's easy to remember, easy to type, it autocompletes, it doesn't interfere with your criteria.


I wondered what that option could be useful for, thanks!


The equivalent in grep is '--line-buffered'.


That's an orthogonal feature: it writes output after each line as opposed to every 4096 bytes, when the output is a pipe instead of a terminal. Useful when the other end of the pipe still goes to the terminal and you want to see it immediately. If `some-util-with-output` echoes stdin then without the option the following would not show you the latest grepped lines until the 4096-byte buffer fills.

  tail -F output.log | grep --line-buffered TEXT | some-util-with-output


huh?


I mean you can use grep as a filter for tail -f output.


Seems pretty comprehensive. One confusing thing I found was these two: "Don't respect ignore files (.gitignore, .ignore, etc)" vs "Skip rules found in VCS ignore files (.gitignore, .hgignore, etc)"

Aren't those the same thing? Shouldn't they be grouped for better comparison?


I’m stuck with rg right now because it’s the only one which correctly handles gitignore files. Generally quite happy with it but I wish it could also use a more powerful regex engine for some less common cases.

Most annoying thing is that $ does not work with windows newlines.


Have you tried sift? https://sift-tool.org/


sift addresses neither of the GP's concerns with ripgrep. It's also much much slower than ripgrep on almost anything other than simple literal scans.


ack doesn't do git ignore right? The matrix suggests it does, and I thought it did, but I don't use it routinely.


It supports a subset but not a very good one. At least it fails to handle leading slashes.


ack does not look in .gitignore files at all.


Not exactly sure what it is, but I really like this presentation format.

Thank you, have been trying to motivate myself to switch to ack/ag for a while. This seems like it might be what I needed.


Consider just switching to ripgrep instead. It's faster, and the default flags and interface are more thoughtful. This chart may make it seem less feature rich, but most of the 'features' it's missing are things you'll never need, or things that your shell should be responsible for ("Pipe output through a pager or other command"?).

The only serious feature you might miss is lookahead/lookbehind in regexes - that's missing by design since if you want guaranteed linear time search you can't have those.


> "Pipe output through a pager or other command"

I really like the discussion that led to the decision of never including a pager:

https://github.com/BurntSushi/ripgrep/issues/86


Good use of color as well as a visual reference for when the tools share a feature. I agree it's a good presentation.

As for search, I'm on Windows and I like to use FileLocator Pro.


The one thing I wish more of these comparison tables had was licensing info, so for those like me who are curious:

1. GNU grep - GPLv3+

2. ack - Artistic License v2.0

3. ag (aka silver searcher) - Apache License 2.0

4. git-grep - GPLv2+/LGPLv2.1+

5. rg - MIT license

So with maybe the exception of rg, all are gpl compatible, that's great news.


Anything that is permissively licensed (like ripgrep) is generally GPL compatible.[1] Note also that ripgrep is dual licensed under the Unlicense or the MIT license, both of which are explicitly GPL compatible according to [1].

[1] - https://www.gnu.org/licenses/license-list.en.html


I'm going to be making a new chart that is more about features than command flags. I'll include the licensing information. Thanks. https://github.com/beyondgrep/website/issues/80


A nice feature of ag is the ability to limit the search to a certain file type. E.g. to only search ruby files: ag --ruby foo.

To see a list of supported types and the matching file extensions: ag --list-file-types.

Just checked and rg does have something similar, but you need to specify the type as an argument to a flag: rg --type ruby foo.


ack does the "limit by a certain filetype", and it also lets you define your own file types or extend existing ones. ag does not.


You can shorten it with -t: -tc, -tphp, -truby, -tsh. Then it's actually the same amount of characters as with ag.


And has the advantage of being pluggable rather than a hard-coded list of file types (though IIRC ack has a configuration file whereas with rg you need to use an alias to set up new types for every invocation).


ack has a configuration file, and it's extremely flexible, including the ability to check shebang lines. If you have a shell script without an extension, it's the only way to know what language it is.
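
A rough sketch of what shebang-based type detection can look like, for readers who have not seen it. This is not how ack implements it; the `shebang_interp` helper is hypothetical and only reads the first line of a file to guess the interpreter.

```shell
# Print the interpreter named on a file's shebang line, or nothing if there is none.
# Hypothetical helper for illustration only; not ack's implementation.
shebang_interp() {
  first_line=$(head -n 1 "$1")
  case "$first_line" in
    '#!'*)
      # Drop the "#!" prefix and split the rest into words
      # (unquoted on purpose, so "#! /bin/sh" with a space also works).
      set -- ${first_line#'#!'}
      # Keep only the last path component of the first word.
      interp=${1##*/}
      # If the script is launched through env, the real interpreter is the next word.
      if [ "$interp" = env ] && [ $# -gt 1 ]; then
        interp=${2##*/}
      fi
      printf '%s\n' "$interp"
      ;;
  esac
}
```

For a file starting with `#!/usr/bin/env python3` this prints `python3`, which is enough to classify an extensionless script.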


This chart doesn't compare speed. Until rg, I didn't grep much because I generally search through a lot of files. rg cuts through them in no time.


No, the chart doesn't compare speed. Lord knows there have been many comparisons of the speeds of grep-alikes, but nobody has written up a comparison of features.

For me, raw speed is not as important as a rich feature set to support my code spelunking.


The `sift` [1] tool presents a performance comparison, but not against `rg`.

[1] https://sift-tool.org/performance


ripgrep's introductory blog post[1] includes a perf comparison, which incorporates sift. But sift is too slow to include in several benchmarks. sift's achievement is its fast parallel directory traverser, coupled with Go's vectorized IndexByte[2] function for simple literals. In that case, it is quite fast, but as soon as you enter Go's regex engine, it's game over.

[1] - http://blog.burntsushi.net/ripgrep/

[2] - https://golang.org/pkg/bytes/#IndexByte


Unfortunately that page does not say when the comparison was made, and which version of each tool was tested. Also, which "grep" is that? I assume GNU grep? There are others, though...

All in all, it'd still be nice to have a more comprehensive performance comparison page, which gets regularly updated. Bonus points if it shows how speed changes over time, similar to http://speed.pypy.org (the code for that is available, by the way).

Wishful thinking, I know, but hey, who knows... :-)


There is more detail available in the benchmark runs at https://github.com/BurntSushi/ripgrep/tree/master/benchsuite...

However, those are from 2016, and so it's hard to tell what might have changed in the meanwhile.


The benchmark suite can be run by anyone: https://github.com/BurntSushi/ripgrep/blob/master/benchsuite...

I re-ran them :-) https://github.com/BurntSushi/ripgrep/tree/master/benchsuite...

TL;DR ripgrep has gotten faster in important areas since the initial set of benchmarks (the proper comparison there would be https://github.com/BurntSushi/ripgrep/tree/master/benchsuite...). The key reasons are that it grew a parallel directory traverser, and its line counting got vectorized courtesy of the bytecount[1] crate. ucg has gotten a little faster in some cases, but the general conclusion of "ripgrep is the fastest" is still correct.

[1] - https://github.com/llogiq/bytecount


Thank you for re-running them. You saved me the trouble. ;)

I would be curious to see ack get into the test suite, however. Even if it is much slower, I'd like to see the results.

And I'd be very curious to hear your reasoning for the different results in the subtitles_ru test cases -- why is rg returning radically different numbers of lines as compared to the other tools?

Thanks!


ack will always be slower than ripgrep, but it shouldn't be as slow as it is in burntsushi's tests. In his tests, he's showing run times where ack takes 25x as long to run as ripgrep, and ack shouldn't be NEARLY that slow.

We think that there's something weird about his Perl installation that is making it so slow, but we haven't been able to figure it out.

Here's the ticket: https://github.com/beyondgrep/ack3/issues/42

If you have any insight, we'd love to have it. We've been stumped, as you'll see if you read through the issue history.


I will dig back into this and see if I can figure it out. If you look at the recent commit history for ripgrep, you'll see I updated the timings for ack on my benchmark in my README. I have no explanation for it, but ack isn't as slow as it was when I went through this before.

Anyway, my Perl installation is the standard one on Archlinux. I will try on other systems.


> I would be curious to see ack get into the test suite, however. Even if it is much slower, I'd like to see the results.

You'll need to add it to the benchsuite script (which should be very easy to do, just peruse the source to see other examples), but for me, ack is too slow to benchmark this way. In theory, I'd be fine adding it to the same benchmarks as pt/sift are in, since they are also generally too slow to benchmark, but are at least fast enough in some of them to tolerate it. But ack has different characteristics. While pt/sift have a very high ceiling (like ack), they also have a very low floor in some cases. ack on the other hand has a reasonably high floor compared to the others, even in the simplest searches. This makes all benchmarks on ack take a long time.

I did a couple ad hoc benchmarks on the same machine:

                                   ripgrep         ack
    linux_alternates                0.113s      9.750s
    linux_alternates_casei          0.133s     19.955s
    linux_literal                   0.103s      7.220s
    linux_literal_casei             0.122s      8.025s
    linux_no_literal                0.356s     18.881s
    linux_re_literal_suffix         0.104s      6.778s
    linux_unicode_greek             0.194s      8.537s (ack reports no results)
    linux_unicode_word              0.111s      7.299s
    linux_word                      0.108s      6.763s
    
    subtitles_en_alternate          0.247s      9.829s
    subtitles_en_alternate_casei    0.247s     43.091s
    subtitles_ru_alternate          0.978s     28.134s
    subtitles_ru_alternate_casei    0.978s    107.314s (ack reports incorrect)
    subtitles_ru_surrounding_words  0.245s      6.633s (ack reports no results)

The subtitles benchmarks are perhaps unfair because I think ack is more focused on directory tree search whereas ripgrep claims to be good at both. I included a few anyway to show the difference though. In general, the benchmark just isn't that interesting, and it makes the benchmark run take a lot longer than it would otherwise (because each command is executed several times).

> And I'd be very curious to hear your reasoning for the different results in the subtitles_ru test cases -- why is rg returning radically different numbers of lines as compared to the other tools?

Because ripgrep correctly supports Unicode, and does it by default because it can generally handle all Unicode features without a corresponding performance loss. GNU grep handles Unicode in general as well (assuming your system's locale settings are up to snuff), but it can pay a huge price for it in some cases, although admittedly, I'd consider such cases to be somewhat infrequent in common usage. It's explained in my blog post: http://blog.burntsushi.net/ripgrep/#single-file-benchmarks --- The subtitles_no_literal is particularly interesting, because it shows what happens when you ask GNU grep to do the correct thing. ;-)

Note that both ag and ucg have the opportunity to support Unicode correctly, but they don't twiddle the right flags in their use of PCRE (and PCRE2, respectively). AFAIK, neither exposes a flag to twiddle these things. From scanning the ack man page, I don't see any option there either, although I'm sure Perl regexes probably have that option too.


Awesome! Thanks for all the information!

I feel kinda sad that the only thing I can do in response is to go install ripgrep and use that instead of the alternatives.

Do you have a Patreon page? Or anything similar?


Haha go for it! :-)

And no, I don't mix money with my free time side projects. Personal choice. Instead, just donate to a charity. My personal favorites are Rails Girls and Wikipedia. The Internet Archive is another good one!


Nice feature-level comparison.

Don't know how you'd pull it off in the current tabular format, but would be great to include some of the UX side of things: For example, I use rg almost exclusively because it has an easy-to-remember syntax (`rg search-string` is a recursive search), attractively colored and well-organized output, and seems much faster compared to other tools in my use cases.


as a long time user of `find . -name "*.foo" -exec grep -Hin needle {} \;` moving to ack has been great! I love the syntax and the speed and the fact that it actually respects your ignore files. ripgrep is great too. ag on the other hand is recommended by everyone but doesn't seem to respect ignores or understand modern ignore syntax. give it a pass.

https://github.com/ggreer/the_silver_searcher/issues/385

it's been over 4 years... it's had its chance.


Spawning a separate grep for each file? That's terribly inefficient. At least use xargs which will run one process on as many files as possible.

But you know there's a -r for recursive, right? And unless you are using some historic relic of grep that is not GNU or BSD and doesn't understand the --include option you can just do:

grep -rin "needle" --include "*.foo" .


I use something along the lines of

    find . -iname '.*' -prune -o -iname '*.foo' -exec grep needle {} +

I don't want to have to learn 10 different command syntaxes for walking directory trees, so find works well. The "+" terminator of find's exec is similar to xargs, but preserves the flexibility of find's exec.
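
A small sketch of the difference between the two `-exec` terminators, using a throwaway directory and made-up file names. The `sh -c 'echo run'` command stands in for grep and prints one line per invocation, so counting lines counts how many processes were spawned.

```shell
# Throwaway directory with three sample files (names invented for this example).
d=$(mktemp -d)
touch "$d/a.foo" "$d/b.foo" "$d/c.foo"

# With "\;" the command is run once per matching file.
per_file=$(find "$d" -name '*.foo' -exec sh -c 'echo run' _ {} \; | wc -l)

# With "+" find appends as many file names as fit to a single invocation.
batched=$(find "$d" -name '*.foo' -exec sh -c 'echo run' _ {} + | wc -l)

echo "invocations with \\;: $per_file, with +: $batched"
rm -rf "$d"
```

With only three files the batched form runs the command once, which is the same saving xargs gives.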


At the least, I recommend moving past the `find . -name '*.foo' -exec grep` pattern if you can help it.

Modern file searchers, of which `ag` is one among others, accept a `--filenametype` argument and skip the `.git` subdirectory if it exists (by default, can be toggled), so `ag --python needle` will recursively search for needle in the current working tree in all files whose filenames end in `.py`.

Yes, you could write a function to do the same in `find`, but then you're just being stubborn. (Which is fine; my .bashrc is littered with aliases and functions of me being stubborn, but if I have to copy my .bashrc file around, I might as well install my preferred searcher to the target system if available.)


Or just use "grep". It is what it's there for.

Use --include and --exclude-dir to do what you describe. (The latter is a good idea to set in your GREP_OPTIONS, unless you actually search .git directories.)


GREP_OPTIONS is deprecated in GNU grep since i think 2.20. It will be removed in a future version, and until then it's going to print an irritating warning message every time you use it.

(I don't think there's any such plan for the various BSD greps, so if you use those exclusively you're probably fine.)
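
One common replacement for GREP_OPTIONS is a small wrapper function in your shell startup file. A minimal sketch, assuming GNU grep (for `--exclude-dir`); the function name `g` is made up for illustration.

```shell
# Wrapper that bakes in default options instead of relying on GREP_OPTIONS.
# "g" is a hypothetical name; --exclude-dir is a GNU grep option.
g() {
  # "command" bypasses any alias or function also named grep.
  command grep --exclude-dir=.git "$@"
}
```

Usage is the same as plain grep, for example `g -rn needle .`, and scripts that call `grep` directly are unaffected because the defaults live in the wrapper rather than in the environment.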


Good to know. I've been considering dropping ACK_OPTIONS, and this just might put me over the edge.


I love ag for one purpose: as a lightning quick “find other instances of the currently highlighted word” helper for Emacs. It’s great for that.


Seems everybody here has switched to rg but I haven't because it's not available in Debian repos (yet) unlike ack/ag. So, do such folks use Arch or just download a pre-built binary? How about updating rg when a new version is out?


The easiest way: first, install rust and cargo, either through your distribution, or through rustup if your distribution doesn't have it yet. Then run "cargo install -f ripgrep". It'll download the source code, build, and install to ~/.cargo/bin, which rustup adds to $PATH for you by default.

Edit: the "-f" in the "cargo install" command is for updating; without the -f, it refuses to install over an already installed version. The first time you install, you can omit the -f.


Note that if you install Rust through Debian, it likely won't be new enough to compile the latest version of ripgrep. I believe Debian packages Rust 1.14, and the last version of ripgrep to work on Rust 1.14 was 0.5.2. So, `cargo install --vers 0.5.2 ripgrep` might be what you want on Debian.


Debian stable has 1.14, but Debian testing has 1.22.1.


rg has been in OpenBSD packages since 6.1


rg for the win. It's not some huge improvement, but it's more of a feeling that things just work by default (similar to what I get from tmux vs screen).


I've got the reverse feeling recently, as ag does smartcase searching by default and rg doesn't...


`alias rg='rg -S'` in your `.bashrc` will fix that for you.

Out of curiosity, what sort of searches do you do that smartcase is desirable? A meaningful number of people seem to prefer it, but I find that most of the time I want to be able to search for variables etc. case sensitively.


We were talking about defaults and feelings here, I'm aware that it's easy enough to fix (and to be fair, I now often use ripgrep with the inverse alias)…

I'm generally a big fan of smartcase. Most of the time I don't care about case and it's easier to type it that way, and when I care, it's often mixed or upper case, so smartcase Does What I Mean. And for the few times when I explicitly search for all-lowercase, it's easy enough to turn it off (M-c, -s, :set nosmartcase etc.).


I added smartcase in ack because I used it constantly in vim. I don't normally want to have to remember if the function I'm searching for is "format_ISBN" or "format_isbn", for example. Some languages (PHP) aren't always case-sensitive, so you need to search both. I'd rather use a "-I" in the few cases where I do want case-sensitive, than having to remember to add "-i" in the 99% of the cases where I don't.
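
For anyone unfamiliar with smartcase, here is a rough sketch of the rule as a wrapper around plain grep: search case-insensitively unless the pattern contains an uppercase letter. The `sgrep` name is made up, and real implementations are more careful (for example about uppercase letters inside regex escapes), so treat this as an illustration of the idea only.

```shell
# Smart-case wrapper around grep: case-insensitive unless the pattern
# contains an uppercase letter. "sgrep" is a hypothetical name.
sgrep() {
  pattern=$1
  shift
  case "$pattern" in
    # Any uppercase letter in the pattern: respect case exactly as typed.
    *[[:upper:]]*) grep -e "$pattern" "$@" ;;
    # All-lowercase pattern: ignore case.
    *)             grep -i -e "$pattern" "$@" ;;
  esac
}
```

With this, `sgrep format_isbn file` finds both `format_isbn` and `format_ISBN`, while `sgrep format_ISBN file` finds only the exact spelling.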


Ack and ag use Perl/PCRE so they're more useful than grep/rg if you know Perl style regular expressions.


GNU grep supports PCRE as well.


switched from ag to rg a few months ago and have nothing but good things to say about the experience.


Same here.

Ripgrep works like a charm

Ack requires Perl, which I am not going to install just for that purpose.


What kind of OS are you running that Perl isn't installed already?


An ideal one.

Like, platonic ideal- because it doesn't exist. That greatly ideal.

I don't know why I care, other than even the super cut down perl being a sizable % of the install size for some of the very small systems I've worked with.


Sorry if this sounds rude, but how do you not have perl already installed? It's come with every distro I've ever used.


I switched also recently, the thing that helped me was adding this:

alias rg='rg -S'

to my .bashrc so that rg had the same default case sensitivity as ag.


I do wish the persistent config file enhancement had a significant chance of becoming a reality. https://github.com/BurntSushi/ripgrep/issues/196

I understand aliases and wrappers fine, but I like having the environment have some way to contribute. It's just my simple preference.

There's also the related #314 which makes even more sense- per project configuration. Sure would be great being able to download a project & have it already setup nicely for ripgrep! https://github.com/BurntSushi/ripgrep/issues/314


> I do wish the persistent config file enhancement had a significant chance of becoming a reality.

As the ticket says (I think), it's going to happen. It's just a lot of tedious work.


Wow, cool. Thanks burntsushi. rg is already amazing. This project really has gotten an enormous amount of love & effort from you & it really shows through & through.

Your implementation of the gitignore algorithm is crazy impressive to me.


I made the switch too (from grep to rg). I really like it so far. It is a little faster than grep for my use!


Any modern tools like this?

sary - a suffix array library and tools http://sary.sourceforge.net/

It finds words in O(log(n)) time by using an additional index.


The problem with suffix arrays---even with a blazing fast SACA---is that they are slow. It will take a long time to generate an index for even a moderately sized code repository.

Typically, if you want an index, you build an inverted index, which maps terms (e.g., n-grams or tokens in your favorite PL) to a postings list. The postings list contains all of the documents in which that term occurs.
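
A toy sketch of the inverted-index idea using standard Unix tools, assuming tokens are runs of letters, digits and underscores. The `build_index` and `lookup` names are made up, and a real code search index would use a far more compact format; this only shows the term-to-postings mapping.

```shell
# Build a token -> file postings list for every regular file under directory $1.
# Output is one "token<TAB>path" line per distinct (token, file) pair.
build_index() {
  find "$1" -type f | while IFS= read -r path; do
    # Split the file into tokens, one per line, then tag each token with its path.
    tr -cs '[:alnum:]_' '\n' < "$path" |
      awk -v p="$path" 'NF { print $0 "\t" p }'
  done | sort -u
}

# Print the files whose postings list contains token $2, using index file $1.
lookup() {
  awk -F '\t' -v t="$2" '$1 == t { print $2 }' "$1"
}
```

After `build_index src > index.tsv`, a query such as `lookup index.tsv parse` only reads the index, not the source files, which is where the speedup over a full scan comes from.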


> No descending into subdirectories

> Limit directory search depth

These could be the same item, the former is a special case of the latter.


The table somehow missed MIT grep.


What has always amazed me is how super fast ack is in my use cases.


I've been using pt for a few years... am I dumb or do I have a secret?


Neither. pt is part of the benchmark suite I published when I introduced ripgrep[1]. TL;DR - It's fast for simple literal searches, but that's it.

[1] - http://blog.burntsushi.net/ripgrep/


Missing `git grep --name-only`


There's a link to the issue tracker right there at the top of the page.



