Rust programmers have this holier-than-thou attitude that is so toxic.
It's essentially wokeism for programming. No wonder it originates from San Francisco, of all places.
The language itself features interesting ideas, many of them borrowed (pun intended) from Haskell, so not that new after all. But the community's behavior has proved consistently abysmal. A real put-off.
> Brotli's fastest compression is slightly faster than zstd's.
Come on, this is not serious.
Brotli's fastest compression algorithm is still significantly slower than zstd's. And more importantly, it compresses _much worse_.
For a third-party evaluation, one can try [TurboBench](https://github.com/powturbo/TurboBench) or even [lzbench](https://github.com/inikep/lzbench), both of which are open source. Squash introduces a wrapper layer that distorts results, which makes it less reliable and more complex to use and install; quite a pity, given that its graphical presentation is very good.
I'm interested in speed, and in this area all benchmarks point in the same direction: for a given speed budget, Zstandard offers a better ratio (and decompresses much faster).
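The "given speed budget" comparison is easy to reproduce with any codec that exposes quality levels. Here is a minimal sketch of the methodology, using Python's built-in zlib as a stand-in (zstd and brotli bindings may not be installed; with them, the same loop applies over their own level ranges):

```python
import time
import zlib

def sweep_levels(data, levels=(1, 5, 9)):
    """Compress `data` at each zlib level; return {level: (size, seconds)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        results[level] = (len(compressed), time.perf_counter() - start)
    return results

if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog. " * 2000
    for level, (size, secs) in sweep_levels(sample).items():
        print(f"level {level}: {size} bytes in {secs * 1000:.2f} ms")
```

To compare two codecs under a speed budget, run the same sweep for both and, among all levels whose measured speed clears the budget, pick the one producing the smallest output.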
zstd 0.7.1 -22 compresses in 4.01 MB/s to 28363 bytes
Of course it is an unfair example because of the static dictionary that brotli uses, but it is not a pathological example: Thai is not part of the static dictionary. The numbers are from an i7-4790K @ 4.00 GHz.
Brotli's fastest compression is faster than that of zstd, at least as shown with lzbench on this file. Brotli also wins in compression density: on this file, brotli -11 produces 10.5% fewer bytes than zstd -22.
> Of course it is an unfair example because of the static dictionary that brotli uses
It is certainly a favorable ground for Brotli.
Brotli claims an advantage in html compression, thanks to its integrated specialized dictionary.
The real problem, though, is the suggested conclusion that these favorable results are broadly applicable everywhere else.
That's a terrible suggestion.
We need more examples, not just HTML files, which happen to be Brotli's best case.
> brotli 0.4.0 -11 compresses to 25413 bytes
> zstd 0.7.1 -22 compresses in 4.01 MB/s to 28363 bytes
Why don't you disclose the compression time of brotli?
Of course it matters: everyone understands that an algorithm that spends 10x more CPU has the budget to compress more.
Here, you don't disclose the compression speed of both algorithms, implying they are equal.
By that standard, LZ4 is probably the best: it's so much faster!
Of course, they do not compress the same...
I was initially thrilled at your detailed answer,
but now, quite frankly, I feel cheated. Grossly so.
This is really disappointing.
I was so vexed that I decided to run the tests myself.
Downloading and using __the same html file__,
the same lzbench, the same library versions, just a different computer and compiler,
here is what it produced:
__Conclusion__:
brotli -0 is indeed fast, faster than in my previous tests.
It seems tuned to reach this objective, but it throws away a lot of compression ratio to get there.
Consequently, brotli -0 is not comparable to zstd -1;
it takes brotli -2 to produce an equivalent compressed size.
By that point, though, zstd is much, much faster.
Which is exactly the question I'm trying to get answered:
which algorithm compresses better for a given speed budget?
That's what matters, at least in our datacenter.
I'm not interested in ultra-slow modes, but while at it,
I wanted to complete the picture with the missing compression speed of brotli -11.
It produced:
zstd -22 : 2.95 MB/s
brotli -11 : 0.53 MB/s
So that's > 5x difference. It surely helps to reach better compression ratios.
I also wanted an answer to "how much does the dictionary help?".
Fortunately, TurboBench can help, thanks to a special mode which turns off dictionary compression.
By using it on the very same sample, brotli -11's compressed size increases
from 25413 to 26639 bytes. That's 5% larger, clearly not negligible.
Still good, but it cuts the advertised size difference in half.
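The percentages can be checked directly from the byte counts reported in this thread (25413 for brotli -11 with its dictionary, 26639 without, 28363 for zstd -22):

```python
with_dict, without_dict, zstd = 25413, 26639, 28363

# How much the static dictionary saves brotli -11 on this file:
dict_gain = (without_dict - with_dict) / with_dict * 100      # ~4.8%

# Brotli's size advantage over zstd -22, with and without the dictionary:
adv_with = (zstd - with_dict) / zstd * 100                    # ~10.4%
adv_without = (zstd - without_dict) / zstd * 100              # ~6.1%

print(f"dictionary gain: {dict_gain:.1f}%")
print(f"advantage over zstd: {adv_with:.1f}% with dict, {adv_without:.1f}% without")
```

So the dictionary accounts for roughly 4 percentage points of the ~10% advantage, which is the "cuts the difference in half" claim above.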
Anyway, I'm clearly disappointed to have had to redo the tests myself,
because some inconvenient results were intentionally undisclosed (or never produced).
This really undermines my trust in future publications.
It taught me something: only trust benchmarks you run yourself.
And now, I should probably benchmark even more...
If you read your own results with care, you will notice that they also show that brotli's fastest setting is faster than zstd's fastest. Your results also show that brotli compresses more than zstd at higher settings, even if you turn off brotli's static dictionary.
If you are interested in zstd 0.7.1 -22, you can reach the same compression density with brotli 0.4.0 at quality setting -7 (at least for this file, with the static dictionary). Then you are comparing brotli's compression speed of 57 MB/s to zstd's 4.01 MB/s: brotli is 14x faster at this compression density.
For decompress-once roundtrip at this density, brotli achieves 53 MB/s, and zstd 4 MB/s. Brotli's roundtrip is 13x faster.
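The roundtrip figure combines compression and decompression throughput: the total time for one compress plus one decompress of the same data is the sum of the two phases' times, so the combined speed is the harmonic combination of the two. A small sketch; the decompression speeds passed in below are assumed round numbers for illustration, not measurements from this thread:

```python
def roundtrip_speed(compress_mbps, decompress_mbps):
    """Effective MB/s for compressing then decompressing the same data once.
    Time per MB is the sum of the per-phase times, so the combined throughput
    is 1 / (1/c + 1/d), always below the slower of the two speeds."""
    return 1.0 / (1.0 / compress_mbps + 1.0 / decompress_mbps)

# Assumed decompression speeds, for illustration only:
print(roundtrip_speed(57.0, 500.0))   # brotli -7: dominated by compression
print(roundtrip_speed(4.01, 1000.0))  # zstd -22: compression dwarfs everything
```

The point of the formula is that when compression is far slower than decompression, the roundtrip number lands just below the compression speed, which is why the 53 MB/s and 4 MB/s figures track the compression speeds so closely.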
In decompression, brotli is 2x slower on Intel, but decompression times at 800+ MB/s are going to be negligible in most uses (think < 1% of cycles in your datacenter) if the data is parsed or processed somehow afterwards.
Brotli's entropy encoding is simpler (no 64-bit operations), and because of this, on 32-bit ARM the decompression speeds of zstd and brotli are about the same.
I acknowledge that there can be use cases where zstd 0.7.1 is preferable to brotli 0.4.0 -- particularly those where a 32+ MB file is compressed at once, and those in the 150-500 MB/s compression speed range -- but even this simple compression test shows that brotli can compress significantly (5-10+%) more at the higher quality settings.
> Why you don't disclose the compression time of brotli ?
I left it out because zstd didn't produce a comparable output size.
I showed the compression and decompression speed of brotli at quality 7, which already compressed more densely than zstd at its maximum setting. Of course it is a somewhat technically flawed comparison, because brotli gets an advantage for this kind of data from its dictionary, but I invite you to test with another set of files. (The three small compression benchmark files discussed earlier in this thread show a similar trend.)
The two benchmarks you mention aren't very useful for comparing brotli and zstd, because they deal with large datasets. Per-call overhead matters a lot for small files, and I imagine brotli in particular is aimed at small files. Zstd (usefully!) calls out dictionary compression, which hints that small files matter for it too, but I'm not positive there's any specific use case for zstd in mind.
In any case, neither the compression ratios nor the speed of large-file compression necessarily says much about small-file performance. There's just much more context to search in a 100MB file than there is in a 10KB file.
That said, there's no reason to assume brotli is better for small files either; there's just no way to tell from the links you provide.
I'm not affiliated with, and don't use, either zstd or brotli.