It's like saying that memcached is faster than Postgres.
They do different things, for different purposes.
I still wouldn't trust redis for anything other than ephemeral storage. Most of the use cases where we use redis assume that the data will at some point go away. The places where we use postgres assume that data is permanent.
In fact, we use both together, and it works really well: the best of both worlds.
Under normal circumstances Redis doesn't lose writes. If it did, we'd be able to detect it in metrics for cache misses or stale data, and our vector clock state machines wouldn't work. We have fine-grained monitoring for all of these classes of failure.
Operationally, we sequence Redis downtime events, which are very rare. This is when most people would be concerned about losing data.
We shift traffic to be hitless. Our model is eventually consistent across multiple regions. We won't accept writes in a cluster that is going down.
The Redis replication chain was chosen so that the primary Redis never has to fork (AOF rewrite, BGSAVE) or even do disk I/O (AOF operations in general). We let our offline replicas do that work, so the primary Redis can focus on serving reads and writes.
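To make the division of labor concrete, here's a rough sketch of that kind of topology using redis-py (hostnames are made up, and in practice you'd bake this into redis.conf rather than set it at runtime):

    import redis

    # Hypothetical hosts. Topology: clients -> primary -> replica (persistence).
    primary = redis.Redis(host="redis-primary.internal", port=6379)
    replica = redis.Redis(host="redis-replica.internal", port=6379)

    # Primary: no RDB snapshots, no AOF. It never forks or touches disk,
    # so it can dedicate itself to serving reads and writes.
    primary.config_set("save", "")
    primary.config_set("appendonly", "no")

    # Replica: follows the primary and does the persistence work instead.
    replica.execute_command("REPLICAOF", "redis-primary.internal", "6379")
    replica.config_set("appendonly", "yes")
    replica.config_set("appendfsync", "everysec")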
We alert on replication chain breakdown. We sequence traffic draining and primary / secondary swap operations for things like OS or Redis upgrades.
It's pretty sophisticated and largely automated.
This model tolerates the loss of Redis instances. The only writes that might be lost are those accepted by the primary in the short window before replication, before the instance is failed out. But that number would be incredibly small. We would tolerate such losses.
We've got a lot of nines of reliability, and this pattern has scaled and served us well.
> We've never encountered data loss issues with our read and write heavy Redis services
> We would tolerate such losses
In the context of the original concern, these are conflicting statements.
The underlying argument here is that you shouldn't use Redis for anything you can't tolerate losing. Your use case and architecture are great, and I'm glad it works well for you. But at the end of the day, there are workloads that can't tolerate such losses, and for those cases Redis is not a good fit.
It's faster, but only under a specific design condition: you have to wait for the store to finish before you reply, and you don't care about the differences in guarantees between the presented storage options.
If the design condition allows asynchronous operation and I don't need to wait for background tasks to finish before I reply, it's not faster, unless I somehow care more about background CPU time per request than I do about things like storage guarantees.
If the design condition requires durable storage to have happened before I reply, putting it in Redis doesn't even count as getting started.
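To make the distinction concrete, here's a minimal sketch of the two acknowledgement models (made-up key, table, and DSN; default configurations assumed):

    import redis
    import psycopg2

    r = redis.Redis()
    # Acknowledged once the write is in memory. Whether it survives a
    # crash depends on appendfsync/replication settings elsewhere.
    r.set("order:123", "paid")

    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        cur.execute("UPDATE orders SET status = 'paid' WHERE id = 123")
    # The COMMIT issued when the block exits doesn't return until the
    # WAL record is fsync'd to disk (with the default synchronous_commit=on).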
Couchbase really is the best of both worlds, in one product. It stores in memory first (the memcached layer) and on disk later, replicated to zero or more cluster nodes. From the calling API, you can choose whether or not to block on the disk commit.
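For illustration, here's roughly what that choice looks like in the Couchbase Python SDK. The exact module paths and option names vary between SDK versions, so treat this as a sketch of the idea rather than copy-paste code:

    from couchbase.auth import PasswordAuthenticator
    from couchbase.cluster import Cluster
    from couchbase.options import ClusterOptions, UpsertOptions
    from couchbase.durability import ServerDurability, Durability

    cluster = Cluster("couchbase://db.internal",
                      ClusterOptions(PasswordAuthenticator("user", "pass")))
    coll = cluster.bucket("app").default_collection()

    # Fast path: acknowledged once the write is in memory on the active node.
    coll.upsert("user::1", {"name": "alice"})

    # Durable path: block until the write has been persisted to disk on a
    # majority of nodes before the call returns.
    coll.upsert("user::1", {"name": "alice"},
                UpsertOptions(durability=ServerDurability(Durability.PERSIST_TO_MAJORITY)))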
This matches my experience as well. I founded a startup in 2013 (and ran it for 3 years) and used Couchbase as our primary data store, for a SaaS whose customers were web sites that together received tens of millions of DAU. With no devops person, running it was easy, and we never had performance or outage issues. I know you can't compare a document/KV store with a relational DB like PG or MySQL, but I have wondered why Couchbase is not as popular as, say, Mongo.
I can give you one hint. I don't really work with databases, so I'm not well-versed on them, but I can easily name all the major RDBMSes (PG, MySQL, Oracle, etc.), and some of the NoSQL ones. I've done a little work at home playing with Postgres, but that's about it.
I've certainly heard about MongoDB many, many times. I've heard of Redis some. I've heard of CouchDB. But this is the very first time I've ever heard of "Couchbase".
It was popular briefly on HN back in 2015, so it had a late start compared to Mongo, which took off maybe around 2012. And the commercialization of memcached was perhaps noticed most by PHP devs, as memcached is common enough with WordPress and Drupal, though Redis might have overtaken it some. I always thought Couchbase was a very interesting idea but never had the chance to seriously suggest it at work. These endorsements actually help me consider it, but at the time it seemed like they didn't do a great job explaining who was using it and why/how it made a difference. I've heard more recently about DocumentDB or CockroachDB than about it, actually. And most places I work at are still traditional Postgres/Oracle SQL database shops, where it's easier to introduce Redis or memcached than to switch where all the data is stored. Using it for a new project would require buy-in, but might be viable.
Mongo's marketing team, plus the near-1:1 logical mapping between documents and the object structures in Node.js and other languages. It's effective, and very easy to use.
Compare that to a closer example like RethinkDB, which was really cool but, with nowhere near the money or marketing team, ultimately failed. Rethink started with stable data first and performance tuning from there; Mongo took the opposite approach and caught up on stability later. Of course, scaling and administering Mongo isn't as easy as a single node or multiple read replicas.
Ditto. I loved using Couchbase at my previous job (Blizzard Entertainment), I'm using it at my current gig (FinTech), and I've never understood how such an excellent product has such terrible marketing. It blows Mongo out of the water.
If you're looking for a super low-latency (everything sub-millisecond) document/JSON store with strong resiliency options, cross-data-centre replication, options for optimistic/pessimistic locking, JSON sub-document access and modification, and so on, I don't know why you'd use anything else.
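The sub-document access mentioned above is worth a look if you haven't seen it: you can read or mutate a single field without shipping the whole document over the wire. A hedged sketch, with the same caveats about SDK versions as before, assuming a collection handle `coll` as in the earlier example:

    import couchbase.subdocument as SD

    # Read one field without fetching the whole document.
    res = coll.lookup_in("user::1", [SD.get("address.city")])

    # Update one field in place, no read-modify-write of the full doc.
    coll.mutate_in("user::1", [SD.upsert("last_login", "2020-01-01T00:00:00Z")])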
Bit of a silly question, but how would you say Couchbase compares to something like CouchDB? I've never used Couchbase directly, but I've always really enjoyed the simplicity of CouchDB.
Would you be willing to rely on this for something like tracking election results?
It really depends on your specific environment and how much information you are, or are not, willing to lose; it's all relative. I'm a fan of high-performing databases (Couchbase), more tunable ones (ScyllaDB/Cassandra), those built for indexing (Elasticsearch), and those built for consistency (SQL databases).
Each has its use cases, and there's definitely overlap. Understanding the differences and using the right tool for the job is important.
The article talks about _how much_ faster it is, which is the interesting question.
For a small project you might not want the added complexity of another database and decide to store JSON blobs in PostgreSQL. In this case it's important to understand the performance trade-off you're making.
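For that small-project case, the "just use Postgres" version is only a few lines (table and column names are made up):

    import psycopg2
    from psycopg2.extras import Json

    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        # A single jsonb column gives you a simple key/value document store.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS kv (
                key   text PRIMARY KEY,
                value jsonb NOT NULL
            )
        """)
        cur.execute(
            "INSERT INTO kv (key, value) VALUES (%s, %s) "
            "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value",
            ("user:1", Json({"name": "alice", "visits": 3})),
        )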
Hashmaps are faster, except when you need the data to be available to multiple machines.
A good description of redis is a "shared heap". HashSets, HashMaps, Heaps, Priority Queues, etc. are all great and fast in an application, but once you need to start sharing that data things get complicated quickly. So you designate a single server to implement those data structures and expose them to your application. And what you end up with is basically redis.
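As a sketch of the "shared heap" idea, each in-process structure has a networked counterpart (redis-py, made-up key names):

    import redis

    r = redis.Redis()  # the shared heap

    # dict -> Redis hash
    r.hset("user:1", mapping={"name": "alice", "visits": "3"})

    # priority queue / heapq -> Redis sorted set
    r.zadd("jobs", {"send-email": 1, "rebuild-index": 5})
    next_job = r.zpopmin("jobs")  # lowest score first, like heapq

    # queue -> Redis list, shared across any number of machines
    r.lpush("events", "signup")
    event = r.brpop("events", timeout=5)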
In Django it's literally a matter of adding 5 lines of code to enable the redis cache. The complexity is in actually setting up the Redis server, but this can also be trivial in many cases.
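For reference, those five lines are just a settings.py entry (this uses the Redis backend built into Django 4.0+; on older versions the django-redis package fills the same role):

    # settings.py
    CACHES = {
        "default": {
            "BACKEND": "django.core.cache.backends.redis.RedisCache",
            "LOCATION": "redis://127.0.0.1:6379",
        }
    }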
Sorry, no. The complexity is that you are now relying on another server, which can break in new ways and requires startup, shutdown, configuration, and possibly load balancing; management of all of the above; updates, security reviews, and documentation. Will you need new server types with an emphasis on RAM? How many, and when? How are you monitoring it for errors and performance?
That's the difference between thinking about it as a solution to a problem and thinking about it as part of your infrastructure.
One of the shops I worked in used Redis as the primary datastore. While there was a lot of extra complexity around relational data (especially referential integrity), the snapshotting systems worked very well and we didn't have to worry about data loss.
I don't necessarily think this was an ideal setup (I can't tell you how many headaches Postgres would have solved for us), but I think there are plenty of cases where you could use it without fear that your data will evaporate.