
MySQL, Postgres, etc. all support transparent compression. I'd be curious how small the database would end up after compression, and what the impact would be on query time.

I'm skeptical it would be as good as the Parquet/SQLite option the author came up with (Postgres, I believe, compresses value by value; I can't remember how MySQL does it).




I can't speak for MySQL, but I suspect the Postgres compression you're referring to is TOAST (https://www.postgresql.org/docs/current/static/storage-toast...).

Its sweet spot is much larger rows. In fact, it only kicks in when a row's content exceeds roughly 2 KB (about a quarter of the 8 KB page size), so it doesn't trigger in this case, where the average row is about 120 bytes (and only 80 bytes of that is content).
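
If you want to see that threshold in action, something like this works (a rough sketch, assuming a local Postgres and the psycopg2 driver; the table name is made up):

  import psycopg2

  conn = psycopg2.connect("dbname=test")  # hypothetical connection string
  cur = conn.cursor()
  cur.execute("CREATE TEMP TABLE toast_demo (v text)")
  cur.execute("INSERT INTO toast_demo VALUES (repeat('x', 80)), (repeat('x', 10000))")
  # pg_column_size() reports the stored size of a value. The 80-byte value
  # is stored as-is (too small for TOAST to bother), while the 10 KB value
  # crosses the ~2 KB threshold and gets pglz-compressed, so its stored
  # size comes out far smaller than its length.
  cur.execute("SELECT length(v), pg_column_size(v) FROM toast_demo ORDER BY 1")
  for length, stored in cur.fetchall():
      print(length, stored)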

I bet you could build the DB, stop Postgres, move its data dir onto a squashfs filesystem, and then start Postgres in read-only mode for huge space savings at minimal query cost, though.

Hmm, in fact, it'd be easy to do that with the SQLite DB since it's just a single file. I might give that a shot.


Squashing the SQLite file works pretty well -- it's a bit bigger and slower than Parquet, but maybe a reasonable trade-off if you'd rather not deal with Parquet.

I added a section at https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html#s... to mention it. Thanks for the inspiration!
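
For anyone who wants to replicate it, the workflow is roughly this (a sketch only -- file names, mount point, and table name are made up, and the mount step needs root):

  import sqlite3
  import subprocess

  # Pack the single-file DB into a compressed squashfs image.
  subprocess.run(["mksquashfs", "flights.sqlite", "flights.squashfs", "-comp", "xz"],
                 check=True)

  # Mount it read-only (outside Python, as root):
  #   mount -o loop,ro flights.squashfs /mnt/flights
  # Then open the file read-only; immutable=1 tells SQLite the file cannot
  # change underneath it, so it skips locking on the read-only mount.
  conn = sqlite3.connect("file:/mnt/flights/flights.sqlite?mode=ro&immutable=1",
                         uri=True)
  print(conn.execute("SELECT count(*) FROM flights").fetchone())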


On the MySQL side, https://dev.mysql.com/doc/refman/8.0/en/innodb-compression-b....

It reads like row-level compression plus index compression? They claim indexes make up a fair chunk of the disk usage, so there may be real savings there.
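
If anyone wants to try it, enabling it is just a table option (a hedged example, assuming mysql-connector-python and a scratch database; the schema is invented):

  import mysql.connector

  conn = mysql.connector.connect(user="root", database="test")
  cur = conn.cursor()
  # ROW_FORMAT=COMPRESSED compresses InnoDB pages, which covers both the
  # clustered index (the row data) and secondary indexes. KEY_BLOCK_SIZE
  # is the compressed page size in KB.
  cur.execute("""
      CREATE TABLE flights_compressed (
          id      INT PRIMARY KEY,
          carrier VARCHAR(8),
          origin  VARCHAR(8),
          dest    VARCHAR(8)
      ) ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
  """)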


For analytics/OLAP you can use ZFS compression with a large block size (recordsize); zstd support is just around the corner, too. I would still use compression for an OLTP database, but with a much smaller block size, 16 KB at most.
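
Concretely, that's just a couple of dataset properties (a sketch that shells out to the zfs CLI; pool/dataset names are made up and this needs root on a ZFS box):

  import subprocess

  def zfs_set(dataset, **props):
      # `zfs set property=value dataset`, one property per call.
      for key, value in props.items():
          subprocess.run(["zfs", "set", f"{key}={value}", dataset], check=True)

  # Analytics/OLAP dataset: big records compress well and scans are sequential.
  # (Swap lz4 for zstd once it lands.)
  zfs_set("tank/olap", recordsize="1M", compression="lz4")
  # OLTP dataset: keep records small (16 KB max) to limit read-modify-write
  # amplification on random updates.
  zfs_set("tank/oltp", recordsize="16K", compression="lz4")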



