> most speed gain tricks in ripgrep have nothing to do with string algorithms
There are actually several tricks at the "string algorithm" level. I didn't invent any of them, but no other search tool combines all of them AFAIK. They are (a rough sketch of each follows the list):
* Heuristic for choosing the skip byte in Boyer-Moore. (Pick rare bytes.)
* Roll UTF-8 decoding into the regex automaton.
* Use SIMD for multiple string literal search. (This is Teddy, an unpublished algorithm from Intel's Hyperscan project.)
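To make the first trick concrete, here is a minimal sketch of the rare-byte heuristic. The frequency ranking below is a toy stand-in (ripgrep's real table is derived from byte frequencies observed in real-world text and code), but the selection logic is the same idea:

```rust
/// Return the offset of the rarest byte in `needle`, according to a
/// background frequency ranking (lower rank = rarer). Toy sketch only;
/// the real table is tuned against actual corpora.
fn rarest_byte_offset(needle: &[u8], rank: &[u8; 256]) -> usize {
    needle
        .iter()
        .enumerate()
        .min_by_key(|&(_, &b)| rank[b as usize])
        .map(|(i, _)| i)
        .unwrap_or(0)
}

fn main() {
    // Hypothetical ranking: ASCII letters and space are "common",
    // everything else is "rare".
    let mut rank = [0u8; 256];
    for b in 0u16..256 {
        rank[b as usize] = match b as u8 {
            b'a'..=b'z' | b'A'..=b'Z' | b' ' => 255, // common
            _ => 0,                                  // rare
        };
    }
    // Skipping keys off '@' here, not the common letters around it.
    let needle = b"user@example.com";
    let off = rarest_byte_offset(needle, &rank);
    assert_eq!(needle[off], b'@');
    println!("skip on byte {:?} at offset {}", needle[off] as char, off);
}
```

At search time, the inner loop then scans for that rare byte (e.g., with `memchr`) and only verifies the full literal around each hit, which keeps the hot loop out of common bytes like spaces and vowels.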
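The UTF-8 trick amounts to compiling Unicode codepoint ranges down to byte-level transitions at regex-compile time, so the automaton never decodes codepoints while matching. The range-splitting step is exposed by the regex-syntax crate's `utf8` module; this sketch shows just that step, not the full compiler:

```rust
// Turn a Unicode scalar value range into sequences of byte ranges that
// a byte-level automaton can use directly as transitions. Wiring the
// output into a DFA is left out here.
use regex_syntax::utf8::Utf8Sequences;

fn main() {
    // Every codepoint in U+0080..=U+07FF encodes as two bytes; the
    // iterator yields byte-range pairs covering exactly that set.
    for seq in Utf8Sequences::new('\u{0080}', '\u{07FF}') {
        println!("{:?}", seq);
    }
}
```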
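A port of that SIMD algorithm (Teddy) is available these days through the aho-corasick crate's `packed` module. Whether the SIMD path actually engages depends on the patterns and available CPU features, and the searcher falls back to other strategies transparently, so treat this as illustrative usage rather than a guarantee of which code path runs:

```rust
use aho_corasick::packed::Searcher;

fn main() {
    // `new` returns None if the pattern set can't be handled by the
    // packed searchers (e.g. too many patterns), hence the Option.
    let searcher = Searcher::new(["Sherlock", "Moriarty"]).unwrap();
    let haystack = "Sherlock Holmes finally outwits Moriarty";
    for m in searcher.find_iter(haystack) {
        println!("pattern {:?} at {}..{}", m.pattern(), m.start(), m.end());
    }
}
```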
There was also significant work required for scaling glob matching. Some popular repositories have gitignore files with over 3000 rules.
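The scaling approach there, in the `globset` crate that ripgrep uses, is to compile the entire set of globs into a single matcher, so a path is tested against all rules at once instead of looping over thousands of globs individually:

```rust
use globset::{Glob, GlobSetBuilder};

fn main() {
    let mut builder = GlobSetBuilder::new();
    // A stand-in for the thousands of rules a big .gitignore can carry.
    builder.add(Glob::new("*.o").unwrap());
    builder.add(Glob::new("target/**").unwrap());
    builder.add(Glob::new("**/*.min.js").unwrap());
    let set = builder.build().unwrap();

    assert!(set.is_match("target/debug/rg"));
    // `matches` reports the index of every rule the path hits.
    assert_eq!(set.matches("assets/app.min.js"), vec![2]);
}
```

Reporting every matching rule (not just the first) is what gitignore semantics need, since the last matching rule is the one that wins.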
All of these tricks are important because ripgrep isn't just for searching code repositories. It should be able to do most things that grep can do, but faster. (The smarter UTF-8 handling is particularly beneficial when searching non-ASCII text.)
Within the context of this thread, though, and speaking as someone who has written a linear-time suffix array construction algorithm: the suffix array isn't particularly well suited to the problem of code search. Its constant factors are just too high, so generating an index would take a long time. (Although I haven't read the OP yet.)
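For readers who haven't met suffix arrays: the appeal is that once you pay the large up-front cost of sorting all suffix offsets, any substring query becomes a binary search. A deliberately naive sketch follows; a real index would use a linear-time construction like SA-IS, but construction cost is exactly where those constant factors bite:

```rust
// Naive suffix array construction plus substring lookup. Construction
// here is far worse than linear time; SA-IS gets it to O(n), but even
// linear-time construction over a big corpus is slow in practice.
// Lookup is the cheap part.
fn suffix_array(text: &[u8]) -> Vec<usize> {
    let mut sa: Vec<usize> = (0..text.len()).collect();
    sa.sort_by(|&a, &b| text[a..].cmp(&text[b..]));
    sa
}

/// Binary-search the suffix array for any one occurrence of `pat`.
fn find(text: &[u8], sa: &[usize], pat: &[u8]) -> Option<usize> {
    sa.binary_search_by(|&i| {
        // Compare the suffix truncated to the pattern's length.
        let end = (i + pat.len()).min(text.len());
        text[i..end].cmp(pat)
    })
    .ok()
    .map(|rank| sa[rank])
}

fn main() {
    let text: &[u8] = b"the quick brown fox jumps over the lazy dog";
    let sa = suffix_array(text);
    let pos = find(text, &sa, b"lazy").unwrap();
    assert!(text[pos..].starts_with(b"lazy"));
    println!("found at byte offset {pos}");
}
```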
> some system call tricks for reading files as quickly as possible.
ripgrep does two different types of searches. One uses memory maps. The other uses standard `read` calls. The latter is actually faster on Linux when searching code repositories. So no real tricks there. :-)
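In miniature, the two modes look something like the following. The `memmap2` crate stands in for ripgrep's own abstractions here, and the read-based mode is simplified: ripgrep actually streams through a fixed-size buffer rather than reading whole files into memory.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};

// Mode 1: plain `read` calls through a buffer. Counterintuitively, this
// tends to win on Linux for the many small files in a code repository,
// because mmap setup/teardown per file isn't free.
fn search_via_read(path: &str, needle: &[u8]) -> io::Result<bool> {
    let mut buf = Vec::new();
    BufReader::new(File::open(path)?).read_to_end(&mut buf)?;
    Ok(buf.windows(needle.len()).any(|w| w == needle))
}

// Mode 2: memory map the file and search it as one big slice. This is
// the better fit for a small number of very large files.
fn search_via_mmap(path: &str, needle: &[u8]) -> io::Result<bool> {
    let file = File::open(path)?;
    // SAFETY: assumes the file is not truncated while the map is live.
    let map = unsafe { memmap2::Mmap::map(&file)? };
    Ok(map.windows(needle.len()).any(|w| w == needle))
}

fn main() -> io::Result<()> {
    println!("{}", search_via_read("/etc/hosts", b"localhost")?);
    println!("{}", search_via_mmap("/etc/hosts", b"localhost")?);
    Ok(())
}
```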
Is your suffix array implementation open source or available online anywhere? I'm always curious to see what other people's versions of these sorts of things look like.
(I wrote an implementation of Ukkonen's suffix tree algorithm in C as maybe my second C program; it was pretty frustrating for a long time.)
> (I wrote an implementation of Ukkonen's suffix tree algorithm in C as maybe my second C program; it was pretty frustrating for a long time.)
Oh dear. You are brave. I briefly considered implementing Ukkonen's algorithm, but ran away scared. :-)
The reason I wrote the SAIS implementation was that I am generally interested in text search, and I wanted to explore the idea of building an index around suffix arrays. Once I finished my implementation and realized just how slow the cutting-edge suffix array construction algorithm was at the time, I mostly gave up on that idea.
Since then, I've been itching to find use cases for it. As a former computational biologist, I'm familiar with how they are used in that field, but I've been looking for other places as well. Haven't found one yet, although I can think of some specialized (hypothetical) things where it might be useful.