
The static dictionary is not used as is, though; its entries can be transformed in many ways. For example, both `<!DOCTYPE html>` and `<!doctype html>` appear in the dictionary, and transform 43 (FermentAll) turns either of them into `<!DOCTYPE HTML>`. (No transform generates `<!doctype HTML>`, though.) So Brotli is not merely following the herd, so to speak. Also, the static dictionary only helps with very short inputs: any long-term pattern will be captured regardless of the static dictionary.
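To make the uppercase-all behaviour concrete, here is a minimal sketch of what that transform does to ASCII dictionary words. The function name `ferment_all_ascii` is made up for illustration, and the sketch only flips ASCII lowercase letters; the real transform also alters multi-byte UTF-8 sequences, which is left out here.

```python
def ferment_all_ascii(word: bytes) -> bytes:
    """ASCII-only sketch of an uppercase-all dictionary transform.

    Hypothetical helper, not Brotli's actual code: every byte in the
    range a-z has bit 0x20 flipped, all other bytes are left alone.
    """
    out = bytearray(word)
    for i, b in enumerate(out):
        if 97 <= b <= 122:  # ASCII 'a'..'z'
            out[i] = b ^ 32
    return bytes(out)


# Both dictionary spellings collapse to the same fully uppercased form.
upper_from_mixed = ferment_all_ascii(b"<!DOCTYPE html>")
upper_from_lower = ferment_all_ascii(b"<!doctype html>")
```

Because the transform only ever uppercases, neither input can produce `<!doctype HTML>`, which matches the parenthetical above.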



I believe that when I investigated this a few years ago I tried it on a few things and found it consistent.

Just now, I took my HN front page and added <!DOCTYPE html> and <!doctype html> in turn. The uppercase version comes to 5678 bytes; the lowercase one to 5675 bytes, three bytes smaller.
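For anyone who wants to repeat this kind of measurement, here is a rough sketch. It assumes the doctype is prepended to the page followed by a newline (the comment does not say exactly where it was added), and the names `DOCTYPES` and `doctype_sizes` are made up for illustration. The compressor is passed in as a callable so the same helper works with any codec.

```python
import zlib
from typing import Callable, Dict

# The two spellings being compared.
DOCTYPES = {
    "uppercase": b"<!DOCTYPE html>",
    "lowercase": b"<!doctype html>",
}


def doctype_sizes(page: bytes, compress: Callable[[bytes], bytes]) -> Dict[str, int]:
    """Return the compressed size of `page` with each doctype prepended.

    Assumption: the doctype goes at the very start, followed by a newline.
    """
    return {
        name: len(compress(doctype + b"\n" + page))
        for name, doctype in DOCTYPES.items()
    }


# zlib is used here only because it ships with Python.
sample_page = b"<html><head><title>example</title></head><body></body></html>"
sizes = doctype_sizes(sample_page, zlib.compress)
```

To compare under Brotli itself, pass `brotli.compress` from the `brotli` package on PyPI instead of `zlib.compress`, assuming that package is installed. The exact byte counts will depend on the page and the compressor settings.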


While Brotli tries to use all those transforms, I guess it is still not exhaustive enough to fully make use of them, given that Brotli (and many others) is asymmetric in design and compressors can be heavily tweaked without affecting decompression too much.



