Because that is what the project does. But I suspect that the analysis would loo...

kevincox · on June 16, 2023

I guess that is mostly true. In theory you could think of a more efficient way to store the indexes, such as run-length encoding. So you could store a file of 1M copies of whatever the first byte of Pi is and then run-length encode the resulting 1M zero addresses. You can also imagine a scheme such as a 2-bit varint encoding of the indexes, or a tally system that uses only 1 bit per offset to store a zero.

...of course you are still better off just compressing the file consisting of a single repeated byte.
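
To make the run-length idea concrete, here is a minimal Python sketch. It assumes each byte of the file has already been replaced by its offset into Pi; the function names and the list-of-offsets layout are made up for illustration and are not taken from the project.

```python
from itertools import groupby


def rle_encode(indexes):
    """Collapse runs of equal offsets into (offset, run_length) pairs."""
    return [(offset, sum(1 for _ in run)) for offset, run in groupby(indexes)]


def rle_decode(pairs):
    """Expand (offset, run_length) pairs back into the flat offset list."""
    out = []
    for offset, count in pairs:
        out.extend([offset] * count)
    return out


# Assumed scenario: a file of 1M copies of the byte found at offset 0 of Pi,
# so every stored offset is 0.
indexes = [0] * 1_000_000
encoded = rle_encode(indexes)
print(encoded)  # [(0, 1000000)]
assert rle_decode(encoded) == indexes
```

The 1M zero offsets collapse to a single pair, but the same trick applied directly to the original file gives the same result without the detour through Pi.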