
It is very good to hear there are others leveraging the full capabilities of this software.

We haven't broken the 100GB barrier for a single SQLite database file yet, but we have strong confidence that everything will simply continue working as expected once we do.



Out of curiosity, I wondered what the biggest SQLite database in my filesystem was:

find / -name '*.sqlite' -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1

(Note the quotes around the glob, so the shell doesn't expand it before find sees it.) The winner is `favicons.sqlite` from my Firefox profile directory, at 40 MB.


macOS version:

mdfind "kMDItemDisplayName == *.sqlite" -0 | xargs -0 stat -f '%z %N' | sort -nr | head -n 5

It should be fairly fast, because mdfind uses the Spotlight backend and already has this data cached.


Does it include system files?


Yeah -- the top 5 for me includes photos and notes


Make sure to include .etilqs files as well


I found the source. It appears it is an etilqs_ prefix rather than .etilqs, but the story is somewhat humorous:

https://github.com/mackyle/sqlite/blob/3cf493d4018042c70a4db...


Chrome History at ~500MB


I have a ~1TB SQLite database (genetic data). All my queries run really fast, and overall I'm very impressed with the performance.


Yep, that's a great use case! With that much data, though, I would recommend splitting it into several databases: backup, replication, and vacuuming all become much easier. That advice isn't specific to sqlite3 =) . I assume your data is immutable or append-only, with rare writes and lots of reads?


Yep! The data is mostly append-only, with massive writes once a month (that's how often the genetic repositories update their data dumps) and a few updates scattered in when annotations change.
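A minimal sketch of that kind of bulk load (the table name, schema, and row count here are made up): wrapping the whole batch in a single transaction is what keeps a massive write fast, since SQLite syncs to disk per transaction rather than per row.

```python
import sqlite3

# Hypothetical monthly bulk load; schema and names are assumptions.
conn = sqlite3.connect(":memory:")  # use a file path for a real database
conn.execute("CREATE TABLE variants (id INTEGER PRIMARY KEY, rsid TEXT, annotation TEXT)")

rows = [(None, f"rs{i}", "benign") for i in range(10000)]

# "with conn" opens one transaction and commits at the end, so all
# 10,000 inserts share a single fsync instead of one per row.
with conn:
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM variants").fetchone()[0])  # 10000
```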

What exactly do you mean by splitting it into several databases? It seems to me that would make backup, replication, and such more difficult, since I'd then have to manage multiple databases. But I don't have experience there, so I'd love to hear if there are easy ways to do that.


If you are doing this for analytical work, a single-database setup is OK. But imagine running something in production on SQLite with a really big database: VACUUMing, creating indexes, and so on get hard. In that case it's great to shard, even if it's just several files on one machine (provided your data can be sharded, e.g. different users' data stored in different dbs).
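The per-user sharding idea above can be sketched like this (shard count, file names, and schema are all assumptions, not anything from the thread): each user id maps to one of a few smaller files, so each file stays fast to VACUUM and back up.

```python
import os
import sqlite3
import tempfile

# Hypothetical sketch: shard per-user data across several SQLite files.
N_SHARDS = 4
DB_DIR = tempfile.mkdtemp()  # use a real data directory in practice

def shard_conn(user_id: int) -> sqlite3.Connection:
    # Same user id always hashes to the same file.
    path = os.path.join(DB_DIR, f"users_{user_id % N_SHARDS}.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (user_id INTEGER, payload TEXT)")
    return conn

# Writes for user 42 land in users_2.db (42 % 4 == 2)...
with shard_conn(42) as conn:
    conn.execute("INSERT INTO events VALUES (?, ?)", (42, "login"))

# ...and reads route to the same shard.
rows = shard_conn(42).execute("SELECT payload FROM events WHERE user_id = 42").fetchall()
print(rows)  # [('login',)]
```

The upside is that maintenance (VACUUM, reindexing, copying a file for backup) now touches one small file at a time; the downside, as the parent comment suspects, is that cross-shard queries become your application's problem.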


The largest I've worked with until now is a bit over 0.5TB. It's still performing as well as if it were a few GB.


For those of us who haven't had a few-GB SQLite db: does it perform differently from one that is a few MB?


A few MB would likely be fully in the OS buffer cache. I think you would see the difference when it no longer fits.



