It's like saying that memcached is faster than Postgres.
They do different things, for different purposes.
I still wouldn't trust redis for anything other than ephemeral storage. Most of the use cases where we use redis assume that the data will at some point go away. The places where we use postgres assume that data is permanent.
In fact, we use both together, and it works really well: the best of both worlds.
Under normal circumstances Redis doesn't lose writes. If it did, we'd be able to detect it in metrics for cache misses or stale data, and our vector clock state machines wouldn't work. We have fine-grained monitoring for all of these classes of failure.
Operationally, we sequence Redis downtime events, which are very rare. This is when most people would be concerned about losing data.
We shift traffic to be hitless. Our model is eventually consistent across multiple regions. We won't accept writes in a cluster that is going down.
The Redis replication chain was chosen so that the primary Redis never has to fork (AOF rewrite, BGSAVE) or even do disk I/O (AOF operations in general). We let our offline replicas do that work, so the primary Redis can focus on serving reads and writes.
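To make the division of labor concrete, here's a rough sketch of that kind of topology using redis-py (hostnames are made up, and in practice you'd bake this into redis.conf rather than set it at runtime):

    import redis

    # Hypothetical hosts. Topology: clients -> primary -> replica (persistence).
    primary = redis.Redis(host="redis-primary.internal", port=6379)
    replica = redis.Redis(host="redis-replica.internal", port=6379)

    # Primary: no RDB snapshots, no AOF. It never forks or touches disk,
    # so it can dedicate itself to serving reads and writes.
    primary.config_set("save", "")
    primary.config_set("appendonly", "no")

    # Replica: follows the primary and does the persistence work instead.
    replica.execute_command("REPLICAOF", "redis-primary.internal", "6379")
    replica.config_set("appendonly", "yes")
    replica.config_set("appendfsync", "everysec")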
We alert on replication chain breakdown. We sequence traffic draining and primary / secondary swap operations for things like OS or Redis upgrades.
It's pretty sophisticated and largely automated.
This model tolerates the loss of Redis instances. The only writes that might be lost are those accepted by the primary in the short window before replication, before the instance is failed out. But that number would be incredibly small. We would tolerate such losses.
We've got a lot of nines of reliability, and this pattern has scaled and served us well.
> We've never encountered data loss issues with our read and write heavy Redis services
> We would tolerate such losses
In the context of the original concern, these are conflicting statements.
The underlying argument here is that you shouldn't use Redis for anything you can't tolerate losing. Your use case and architecture are great, and I'm glad it works well for you. But at the end of the day, there are workloads that can't tolerate such losses, and for those cases Redis is not a good fit.
It's faster, but only under a specific design condition: you have to wait for the store to finish before you reply, and you don't care about the differences in guarantees between the presented storage options.
If the design condition allows asynchronous operation and I don't need to wait for background tasks to finish before I reply, it's not faster, unless I somehow care more about background CPU time per request than I do about things like storage guarantees.
If the design condition requires durable storage to have happened before I reply, putting it in Redis doesn't even count as getting started.
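To make the distinction concrete, here's a minimal sketch of the two acknowledgement models (made-up key, table, and DSN; default configurations assumed):

    import redis
    import psycopg2

    r = redis.Redis()
    # Acknowledged once the write is in memory. Whether it survives a
    # crash depends on appendfsync/replication settings elsewhere.
    r.set("order:123", "paid")

    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        cur.execute("UPDATE orders SET status = 'paid' WHERE id = 123")
    # The COMMIT issued when the block exits doesn't return until the
    # WAL record is fsync'd to disk (with the default synchronous_commit=on).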
Couchbase really is the best of both worlds, in one product. It stores in memory first (the memcached layer) and on disk later, replicated to zero or more cluster nodes. From the calling API, you can choose whether or not to block on the disk commit.
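For illustration, here's roughly what that choice looks like in the Couchbase Python SDK. The exact module paths and option names vary between SDK versions, so treat this as a sketch of the idea rather than copy-paste code:

    from couchbase.auth import PasswordAuthenticator
    from couchbase.cluster import Cluster
    from couchbase.options import ClusterOptions, UpsertOptions
    from couchbase.durability import ServerDurability, Durability

    cluster = Cluster("couchbase://db.internal",
                      ClusterOptions(PasswordAuthenticator("user", "pass")))
    coll = cluster.bucket("app").default_collection()

    # Fast path: acknowledged once the write is in memory on the active node.
    coll.upsert("user::1", {"name": "alice"})

    # Durable path: block until the write has been persisted to disk on a
    # majority of nodes before the call returns.
    coll.upsert("user::1", {"name": "alice"},
                UpsertOptions(durability=ServerDurability(Durability.PERSIST_TO_MAJORITY)))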
This matches my experience as well. I founded a startup in 2013 (and ran it for 3 years) and used Couchbase as our primary data store, for a SaaS whose customers were web sites that together received tens of millions of DAU. With no devops person, running it was easy, and we never had performance or outage issues. I know you can't compare a document/KV store with a relational DB like PG or MySQL, but I have wondered why Couchbase is not as popular as, say, Mongo.
I can give you one hint. I don't really work with databases, so I'm not well-versed on them, but I can easily name all the major RDBMSes (PG, MySQL, Oracle, etc.), and some of the NoSQL ones. I've done a little work at home playing with Postgres, but that's about it.
I've certainly heard about MongoDB many, many times. I've heard of Redis some. I've heard of CouchDB. But this is the very first time I've ever heard of "Couchbase".
It was popular briefly on HN back in 2015, so it had a late start compared to Mongo, which took off maybe around 2012. And the commercialization of memcached was perhaps noticed most by PHP devs, as memcached is common enough with WordPress and Drupal, though Redis might have overtaken it some. I always thought Couchbase was a very interesting idea but never had the chance to seriously suggest it at work. These endorsements actually help me consider it, but at the time it seemed like they didn't do a great job explaining who was using it and why/how it made a difference. I've heard more recently about DocumentDB or CockroachDB than about it, actually. And most places I work at are still traditional Postgres/Oracle SQL database shops, where it's easier to introduce Redis or memcached than to switch where all the data is stored. Using it for a new project would require buy-in, but might be viable.
Mongo's marketing team, plus the near-1:1 logical mapping between documents and the object structures in Node.js and other languages. It's effective, and very easy to use.
Compare that to a closer example like RethinkDB, which was really cool but, with nowhere near the money or marketing team, ultimately failed. Rethink started with stable data first and performance tuning from there; Mongo took the opposite approach and caught up on stability later. Of course, scaling and administering Mongo isn't as easy as a single node or multiple read replicas.
Ditto. I loved using Couchbase at my previous job (Blizzard Entertainment), I'm using it at my current gig (FinTech), and I've never understood how such an excellent product has such terrible marketing. It blows Mongo out of the water.
If you're looking for a super low-latency (everything sub-millisecond) document/JSON store with strong resiliency options, cross-data-centre replication, options for optimistic/pessimistic locking, JSON sub-document access and modification, and so on, I don't know why you'd use anything else.
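The sub-document access mentioned above is worth a look if you haven't seen it: you can read or mutate a single field without shipping the whole document over the wire. A hedged sketch, with the same caveats about SDK versions as before, assuming a collection handle `coll` as in the earlier example:

    import couchbase.subdocument as SD

    # Read one field without fetching the whole document.
    res = coll.lookup_in("user::1", [SD.get("address.city")])

    # Update one field in place, no read-modify-write of the full doc.
    coll.mutate_in("user::1", [SD.upsert("last_login", "2020-01-01T00:00:00Z")])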
Bit of a silly question, but how would you say Couchbase compares to something like CouchDB? I've never used Couchbase directly, but I've always really enjoyed the simplicity of CouchDB.
Would you be willing to rely on this for something like tracking election results?
It really depends on your specific environment and how much information you are, or are not, willing to lose; it's all relative. I'm a fan of high-performing databases (Couchbase), more tunable ones (ScyllaDB/Cassandra), those built for indexing (Elasticsearch), and those built for consistency (SQL databases).
Each has its use cases, and there's definitely overlap. Understanding the differences and using the right tool for the job is important.
The article talks about _how much_ faster it is, which is the interesting question.
For a small project you might not want the added complexity of another database and decide to store JSON blobs in PostgreSQL. In this case it's important to understand the performance trade-off you're making.
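For that small-project case, the "just use Postgres" version is only a few lines (table and column names are made up):

    import psycopg2
    from psycopg2.extras import Json

    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        # A single jsonb column gives you a simple key/value document store.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS kv (
                key   text PRIMARY KEY,
                value jsonb NOT NULL
            )
        """)
        cur.execute(
            "INSERT INTO kv (key, value) VALUES (%s, %s) "
            "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value",
            ("user:1", Json({"name": "alice", "visits": 3})),
        )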
Hashmaps are faster, except when you need the data to be available to multiple machines.
A good description of redis is a "shared heap". HashSets, HashMaps, Heaps, Priority Queues, etc. are all great and fast in an application, but once you need to start sharing that data things get complicated quickly. So you designate a single server to implement those data structures and expose them to your application. And what you end up with is basically redis.
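As a sketch of the "shared heap" idea, each in-process structure has a networked counterpart (redis-py, made-up key names):

    import redis

    r = redis.Redis()  # the shared heap

    # dict -> Redis hash
    r.hset("user:1", mapping={"name": "alice", "visits": "3"})

    # priority queue / heapq -> Redis sorted set
    r.zadd("jobs", {"send-email": 1, "rebuild-index": 5})
    next_job = r.zpopmin("jobs")  # lowest score first, like heapq

    # queue -> Redis list, shared across any number of machines
    r.lpush("events", "signup")
    event = r.brpop("events", timeout=5)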
In Django it's literally a matter of adding 5 lines of code to enable the redis cache. The complexity is in actually setting up the Redis server, but this can also be trivial in many cases.
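For reference, those five lines are just a settings.py entry (this uses the Redis backend built into Django 4.0+; on older versions the django-redis package fills the same role):

    # settings.py
    CACHES = {
        "default": {
            "BACKEND": "django.core.cache.backends.redis.RedisCache",
            "LOCATION": "redis://127.0.0.1:6379",
        }
    }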
Sorry, no. The complexity is that you are now relying on another server, which can break in new ways and requires startup, shutdown, configuration, and possibly load balancing; management of all of the above; updates, security reviews, and documentation. Will you need new server types with an emphasis on RAM? How many, and when? How are you monitoring it for errors and performance?
That's the difference between thinking about it as a solution to a problem and thinking about it as part of your infrastructure.
One of the shops I worked in used Redis as the primary datastore. While there was a lot of extra complexity around relational data (especially referential integrity), the snapshotting systems worked very well and we didn't have to worry about data loss.
I don't necessarily think this was an ideal setup (I can't tell you how many headaches Postgres would have solved for us), but I think there are plenty of cases where you could use it without fear that your data will evaporate.