In order to reconcile ACID with CAP, this defines a weakened form of ACID to mean whatever some databases currently marketed as ACID-compliant actually support, so that it can claim to offer effective ACID compliance while still choosing CA over partition tolerance (in the http://codahale.com/you-cant-sacrifice-partition-tolerance/ sense). For a lot of applications, the weakened isolation guarantees aren't, or shouldn't be, negotiable: if you try to sneak by without them, they'll cause data integrity issues at scale.
I'm not saying the solution doesn't provide a valuable framework for building robust applications that can overcome those issues (necessarily pushing some of that complexity up the stack to the application developer), but the marketing seems a little bit suspicious?
Edited to add: In fairness, the article doesn't actually claim to have evaded CAP - it recognizes that HAT is a compromise. But I believe it's easy to understate the practical problems with non-serializable transactions. It becomes impossible to prevent duplicate transactions from being created on the split-brain nodes. In banking, for instance, this would be a Bad Thing and would lead to potentially hairy application-specific mop-up when the nodes resync.
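To make the split-brain concern concrete, here's a toy Python sketch (the replicas, amounts, and log structure are all invented for illustration) of how two partitioned nodes can each approve a withdrawal against the same balance, leaving an overdraft to mop up at resync:

```python
# Hypothetical two-replica sketch: during a partition, each replica
# independently approves a withdrawal against the same $100 balance.
balance = 100

replica_a_log = []
replica_b_log = []

def withdraw(log, seen_balance, amount):
    """Approve a withdrawal if the locally visible balance covers it."""
    if seen_balance >= amount:
        log.append(-amount)
        return True
    return False

# Both partitioned replicas see the pre-partition balance of 100,
# so both approve an $80 withdrawal with no way to coordinate.
assert withdraw(replica_a_log, balance, 80)
assert withdraw(replica_b_log, balance, 80)

# On resync, the union of both logs is applied: the account is overdrawn.
merged_balance = balance + sum(replica_a_log) + sum(replica_b_log)
print(merged_balance)  # -60
```

The cleanup is exactly the "application-specific mop-up" above: the database can't tell you which of the two approvals to honor.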
Good point, and well-taken. As I mention in http://www.bailis.org/blog/hat-not-cap-introducing-highly-av... (and devote a full section to in the paper, including documented isolation anomalies like lost updates, write skew, and anti-dependency cycles), there are many guarantees that aren't achievable in a highly available environment. Our goal is to push the limits of what is achievable and, by matching the weak isolation provided by many databases, hopefully provide a familiar programming interface.
As I tried to stress in the post, we aren't claiming to "beat CAP" or provide "100% ACID compliance"; we're attempting to strengthen the semantic limits of highly available systems. I intended "HAT, not CAP" as a play on acronyms, not as a claim to achieve the impossible.
edit: We're also certainly not claiming to have a "CA" solution, whatever that means. There's a lot of confusion between "CAP atomicity"==linearizability and "ACID atomicity"=="transactional atomicity"/"all or nothing"; see http://www.bailis.org/blog/hat-not-cap-introducing-highly-av...
> matching the weak isolation provided by many databases, hopefully provide a familiar programming interface
I'm not sure it's really that familiar. Just knowing how to make requests doesn't ensure you really understand all the ways the answers could be wrong, much less have done the analysis and proven you can withstand all those failure modes. I think a lot of systems out there are quietly corrupting themselves in ways the maintainers didn't have high enough scale or good enough analytics to notice, at least not early enough to recover to a valid state.
> In order to reconcile ACID with CAP, this defines a weakened form of ACID to mean whatever-some-databases-currently marketed-as-ACID-compliant-support
Are you referring to the isolation guarantees? "Repeatable Read" (which this provides) is a pretty reasonable standard of isolation; while "Fully Serializable" is stronger, it's also more expensive. Engines like PostgreSQL that can be run in either mode are most often run in "repeatable read" mode AFAIK.
Good point, repeatable read is a pretty useful guarantee, though I would be loath to give up global integrity constraints. Read committed seems to put a lot of work on the application developer's plate, though, and it's not clear to me what impact the different isolation levels have on system performance.
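For anyone unfamiliar with the lost-update anomaly mentioned upthread, here's a minimal Python sketch (an invented in-memory stand-in for two concurrent database transactions, not any real engine's behavior) of why weak isolation pushes work onto the application developer:

```python
# Toy sketch of a lost update: two "transactions" under weak isolation
# both read the counter before either one commits its write.
counter = {"value": 10}

t1_read = counter["value"]  # transaction 1 reads 10
t2_read = counter["value"]  # transaction 2 also reads 10

# Each writes back read + 1; the second write silently clobbers the first.
counter["value"] = t1_read + 1
counter["value"] = t2_read + 1

print(counter["value"])  # 11, not 12: one increment was lost
```

A serializable history (or an explicit read-modify-write lock) would force one transaction to see the other's increment and produce 12; under weaker levels the application has to detect or prevent this itself.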
To be fair to the article, it's pretty upfront about the constraints. From the article:
'Of course, there are several guarantees that HATs cannot provide. Not even the best of marketing teams can produce a real database that “beats CAP”; HATs cannot make guarantees on data recency during partitions, although, in the absence of partitions, data may not be very stale. HATs cannot be “100% ACID compliant” as they cannot guarantee serializability'
My concern is the pitched low-latency use case: if I understand correctly, there's no way to avoid an extra round trip for stronger guarantees?
In general, I haven't seen algorithms guaranteeing serializability or atomicity that complete without a round trip to at least one other replica (or a possibly long trip to the master). Intuitively, the impossibility results dictate that this must be the case, otherwise partitioned replicas could safely serve requests. Daniel Abadi has a great post about this latency-consistency trade-off: http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-an...
Interesting paper; I hope to see a follow-on that actually describes the algorithm in full. As written, it doesn't cover the failure-recovery, data-drift, and timeout cases.
Also maybe you could speak to a few constraints:
- missing updates
- unique index
and outline your thoughts as to how an application developer might avoid pitfalls. Most applications I have seen tend to require these constraints or run into these issues.
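The unique-index case in particular is easy to demonstrate. Here's a hypothetical Python sketch (the replica dicts and helper are invented for illustration) of why a global unique constraint can't be enforced while both sides of a partition accept writes:

```python
# Each replica enforces uniqueness only against its own local index,
# so both sides of a partition can accept the same "unique" key.
replica_a = {}  # username -> user id
replica_b = {}

def insert_unique(index, username, user_id):
    """Enforce the unique constraint against one replica's local index."""
    if username in index:
        raise KeyError("duplicate username")
    index[username] = user_id

insert_unique(replica_a, "alice", 1)  # accepted on replica A
insert_unique(replica_b, "alice", 2)  # also accepted on replica B

# After the partition heals, the merged state violates uniqueness.
conflicts = [name for name in replica_b if name in replica_a]
print(conflicts)  # ['alice']
```

So a highly available system can only detect the violation after the fact, at which point resolving it (merge the accounts? rename one?) is application-specific.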