
It is very good to hear there are others leveraging the full capabilities of this software.

We haven't broken the 100GB barrier for a single SQLite database file yet, but we have strong confidence that everything will simply continue working as expected once we do.



Out of curiosity, I wondered what the biggest SQLite database in my filesystem was:

find / -name '*.sqlite' -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1

(Note the quotes around the glob, so the shell doesn't expand it before find sees it.) The winner is `favicons.sqlite` from my Firefox profile directory, at 40 MB.


macOS version:

mdfind "kMDItemDisplayName == *.sqlite" -0 | xargs -0 stat -f '%z %N' | sort -nr | head -n 5

It should be fairly fast, because mdfind uses the Spotlight backend and already has this data cached.


Does it include system files?


Yeah -- the top 5 for me includes photos and notes


Make sure to include .etilqs files as well


I found the source. It appears it is an etilqs_ prefix rather than .etilqs, but the story is somewhat humorous:

https://github.com/mackyle/sqlite/blob/3cf493d4018042c70a4db...


Chrome History at ~500MB


I have a ~1TB SQLite database (genetic data). All my queries run really fast, and overall I'm very impressed with the performance.


Yep, that's a great use case! With that much data, though, I would recommend splitting it into several databases: backup, replication, and vacuuming all become much easier. That advice isn't specific to sqlite3 =) . I assume your data is immutable or append-only, with rare writes and lots of reads?


Yep! The data is mostly append-only, with massive writes once a month (that's how often the genetic repositories update their data dumps) and a few updates scattered in when annotations change.
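A minimal sketch of that kind of bulk load (the table name, schema, and row count here are made up): wrapping the whole batch in a single transaction is what keeps a massive write fast, since SQLite syncs to disk per transaction rather than per row.

```python
import sqlite3

# Hypothetical monthly bulk load; schema and names are assumptions.
conn = sqlite3.connect(":memory:")  # use a file path for a real database
conn.execute("CREATE TABLE variants (id INTEGER PRIMARY KEY, rsid TEXT, annotation TEXT)")

rows = [(None, f"rs{i}", "benign") for i in range(10000)]

# "with conn" opens one transaction and commits at the end, so all
# 10,000 inserts share a single fsync instead of one per row.
with conn:
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM variants").fetchone()[0])  # 10000
```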

What exactly do you mean by splitting it into several databases? It seems to me that would make backup, replication, and such more difficult, since I'd then have to manage multiple databases. But I don't have experience there, so I'd love to hear if there are easy ways to do that.


If you are doing this for analytical work, a single-database setup is OK. But imagine running something in production on SQLite with a really big database: VACUUMing, creating indexes, and so on get hard. In that case it's great to shard, even if it's just several files on one machine (provided your data can be sharded, e.g. different users' data stored in different dbs).
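The per-user sharding idea above can be sketched like this (shard count, file names, and schema are all assumptions, not anything from the thread): each user id maps to one of a few smaller files, so each file stays fast to VACUUM and back up.

```python
import os
import sqlite3
import tempfile

# Hypothetical sketch: shard per-user data across several SQLite files.
N_SHARDS = 4
DB_DIR = tempfile.mkdtemp()  # use a real data directory in practice

def shard_conn(user_id: int) -> sqlite3.Connection:
    # Same user id always hashes to the same file.
    path = os.path.join(DB_DIR, f"users_{user_id % N_SHARDS}.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (user_id INTEGER, payload TEXT)")
    return conn

# Writes for user 42 land in users_2.db (42 % 4 == 2)...
with shard_conn(42) as conn:
    conn.execute("INSERT INTO events VALUES (?, ?)", (42, "login"))

# ...and reads route to the same shard.
rows = shard_conn(42).execute("SELECT payload FROM events WHERE user_id = 42").fetchall()
print(rows)  # [('login',)]
```

The upside is that maintenance (VACUUM, reindexing, copying a file for backup) now touches one small file at a time; the downside, as the parent comment suspects, is that cross-shard queries become your application's problem.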


The largest I've worked with until now is a bit over 0.5TB. It's still performing as well as if it were a few GB.


For those of us who haven't had a few-GB SQLite db: does it perform differently from one that is a few MB?


A few MB would likely be fully in the OS buffer cache. I think you would see the difference when it no longer fits.



