The use case is caching 20 million API responses that almost never change, each about 20 KB of JSON, for a high-traffic site.
Yes, I can pay for a 400 GB RAM instance of Redis, but it's expensive.
I can also cache it on disk, but then I need to think about cache expiration myself.
Or I can use something more appropriate like a document database, but then I need additional code and configuration for a piece of infrastructure we otherwise don't need in our stack.
It would be a lot easier if I could just store it in Redis with the other (more reasonably sized) things that I need to cache.
In other abuses of SQLite, I wrote a tool [0] that exposes blobs in SQLite via an Amazon S3 API. It doesn't do expiry (but that would be easy enough to add if S3 does it).
We were using it to manage millions of images for machine learning, since many tools support S3 and the ability to attach custom metadata to objects is useful (harder with plain files). It is one SQLite database per bucket, but at the bucket level it is transactional.
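For context, the core of such a store is tiny. A minimal sketch of the blob-plus-metadata idea (this schema is invented for illustration, not the actual tool's):

```python
import sqlite3

conn = sqlite3.connect("bucket.db")  # one database per bucket
conn.execute("""
    CREATE TABLE IF NOT EXISTS objects (
        key      TEXT PRIMARY KEY,   -- S3 object key
        body     BLOB NOT NULL,      -- the object bytes
        metadata TEXT                -- JSON of custom metadata headers
    )
""")

# Writes are transactional at the bucket (database) level.
with conn:
    conn.execute(
        "INSERT OR REPLACE INTO objects (key, body, metadata) VALUES (?, ?, ?)",
        ("images/0001.png", b"...png bytes...", '{"label": "cat"}'),
    )
```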
Redis Data Tiering - Redis Enterprise and AWS ElastiCache for Redis support data tiering (using SSD for 80% of the dataset and moving things in and out). On AWS, a cache.r6gd.4xlarge with 100GB of memory can handle 500GB of data.
Local Files
> I can also cache it on disk, but then I need to think about cache expiration myself.
Is the challenge that you need it shared among many machines? On a single machine you can put 20 million files in a directory hierarchy and let the fs cache keep things hot in memory as needed. Or use SQLite, which will only load the pages needed for each query and also relies on the fs cache.
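A sketch of the single-machine file approach, with expiry handled lazily by checking each file's mtime on read (the cache path, TTL, and helper names here are all assumptions):

```python
import hashlib
import os
import time

CACHE_DIR = "/var/cache/api"   # placeholder path
TTL = 7 * 24 * 3600            # e.g. one week; tune to taste

def _path(key: str) -> str:
    h = hashlib.sha256(key.encode()).hexdigest()
    # Two levels of fan-out so no single directory holds millions of files.
    return os.path.join(CACHE_DIR, h[:2], h[2:4], h)

def get(key: str) -> bytes | None:
    p = _path(key)
    try:
        if time.time() - os.path.getmtime(p) > TTL:
            os.remove(p)       # expired: treat as a miss
            return None
        with open(p, "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None

def put(key: str, value: bytes) -> None:
    p = _path(key)
    os.makedirs(os.path.dirname(p), exist_ok=True)
    tmp = p + ".tmp"
    with open(tmp, "wb") as f:  # write-then-rename so readers never see partial files
        f.write(value)
    os.replace(tmp, p)
```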
S3 - An interesting solution is one of the SQLite S3 VFSes. Those will query S3 fairly efficiently for specific data in a large dataset.
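What makes those efficient is that S3 supports ranged GETs, so a VFS can fetch just the database pages a query touches instead of the whole file. A sketch of that underlying primitive with boto3 (bucket and key names are placeholders):

```python
import boto3

s3 = boto3.client("s3")

def read_range(bucket: str, key: str, offset: int, length: int) -> bytes:
    # Fetch only `length` bytes starting at `offset`, not the whole object.
    resp = s3.get_object(
        Bucket=bucket,
        Key=key,
        Range=f"bytes={offset}-{offset + length - 1}",
    )
    return resp["Body"].read()

# e.g. read just the 100-byte SQLite header of a database stored in S3
header = read_range("my-cache-bucket", "cache.db", 0, 100)
```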
You could try using Amazon S3 Express, a low-latency alternative for S3 buckets [0]. I imagine cache invalidation would be relatively simple to implement using lifecycle policies.
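A sketch of such a rule with boto3, expiring every object 30 days after creation (the bucket name is a placeholder, and it's worth verifying which lifecycle features S3 Express directory buckets actually support):

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-cache-bucket",                      # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-cached-responses",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},          # apply to every object
                "Expiration": {"Days": 30},        # delete 30 days after creation
            }
        ]
    },
)
```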
Or shard it - divide your objects up based on some criterion (hash the name of the object, use the first N digits of the hash to assign it to a shard) and distribute them across multiple redis instances. Yes, you then need to maintain some client code to pick the right redis instance to fetch from (a sketch follows below), but you can now pick the most $/memory-efficient instance types to run redis, and you don't have to worry about introducing disk read latency and the edge cases that brings with it.
Edit: looks like redis has some built-in support for data sharding when used as a cluster (https://redis.io/docs/latest/commands/cluster-shards/) - I haven't used that, so not sure how easy it is to apply, and exactly what you'd have to change.
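The client-side picking logic mentioned above is only a few lines. A sketch with redis-py (hostnames, shard count, and the choice of hash are all arbitrary):

```python
import hashlib

import redis  # redis-py client

# One connection per shard; hosts are placeholders.
SHARDS = [
    redis.Redis(host="redis-0.internal"),
    redis.Redis(host="redis-1.internal"),
    redis.Redis(host="redis-2.internal"),
    redis.Redis(host="redis-3.internal"),
]

def shard_for(key: str) -> redis.Redis:
    # Use a real digest, not hash(), which is randomized per process.
    digest = hashlib.sha256(key.encode()).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]

def cache_get(key: str) -> bytes | None:
    return shard_for(key).get(key)

def cache_set(key: str, value: bytes, ttl: int = 86400) -> None:
    shard_for(key).set(key, value, ex=ttl)
```

Note that modulo sharding means adding a shard reshuffles most keys; for a cache that's usually just a temporary hit-rate drop rather than a correctness problem.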
You're trying to get redis to be what it isn't. Use a thing that has the properties you want: a document or relational database. If you insist on this, then running a system that allows a ton of swap onto a reasonably fast disk might work, but it's still gonna perform worse than a system designed for concurrently serving queries of wildly differing latencies.
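For what it's worth, the relational version of this cache is about as simple as tables get. A sketch in Postgres terms with psycopg2 (all names and the DSN are placeholders):

```python
import psycopg2

conn = psycopg2.connect("dbname=cache")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS api_cache (
            key        TEXT PRIMARY KEY,
            body       JSONB NOT NULL,
            expires_at TIMESTAMPTZ NOT NULL
        )
    """)
    # Reads filter out expired rows; a periodic DELETE reclaims the space.
    cur.execute(
        "SELECT body FROM api_cache WHERE key = %s AND expires_at > now()",
        ("some-key",),
    )
    row = cur.fetchone()
```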
Have you looked at Varnish for caching API responses? Varnish lets you back the cache with disk and relies on the page cache to keep the most-accessed items in memory.
If the reverse proxy approach doesn't work, I think memcached has two-level storage like that now (extstore, iirc).