It struck me that Jepsen has identified clear situations leading to invariant violations, but Datomic’s approach seems to have been purely to clarify their documentation. Does this essentially mean the Datomic team accepts that the violations will happen, but doesn't care?
From the article:
> From Datomic’s point of view, the grant workload’s invariant violation is a matter of user error. Transaction functions do not execute atomically in sequence. Checking that a precondition holds in a transaction function is unsafe when some other operation in the transaction could invalidate that precondition!
As Jepsen confirmed, Datomic’s mechanisms for enforcing invariants work as designed. What does this mean practically for users? Consider the following transactional pseudo-data:
[
[Stu favorite-number 41]
;; maybe more stuff
[Stu favorite-number 42]
]
An operational reading of this data would be that early in the transaction I liked 41, and that later in the transaction I liked 42. Observers after the end of the transaction would hopefully see only that I liked 42, and we would have to worry about the conditions under which observers might see 41.
This operational reading of intra-transaction semantics is typical of many databases, but it presumes the existence of multiple time points inside a transaction, which Datomic neither has nor wants — we quite like not worrying about what happened “in the middle of” a transaction. All facts in a transaction take place at the same point in time, so in Datomic this transaction states that I started liking both numbers simultaneously.
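To make that concrete, here is a toy model in plain Python (not Datomic's actual implementation; the `Datom` type and `transact` function are invented for illustration): a transaction is a set of [entity attribute value t] facts, and every fact in the transaction is stamped with the same t.

```python
from typing import NamedTuple

class Datom(NamedTuple):
    e: str     # entity
    a: str     # attribute
    v: object  # value
    t: int     # transaction time point

def transact(db: set, tx_data: list, t: int) -> set:
    # Every fact in the transaction receives the SAME time point t:
    # there is no "earlier" or "later" within a transaction.
    return db | {Datom(e, a, v, t) for (e, a, v) in tx_data}

db = transact(set(), [("Stu", "favorite-number", 41),
                      ("Stu", "favorite-number", 42)], t=1000)

# Both assertions carry t=1000: under this model the transaction claims
# I started liking 41 and 42 simultaneously, not one after the other.
assert {d.v for d in db} == {41, 42}
assert {d.t for d in db} == {1000}
```

Because there is only one t, the "intermediate" state where I liked 41 but not yet 42 simply cannot be expressed.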
If you incorrectly read Datomic transactions as composed of multiple operations, you can of course find all kinds of “invariant anomalies”. Conversely, you can find “invariant anomalies” in SQL by incorrectly imposing Datomic’s model on SQL transactions. Such potential misreadings emphasize the need for good documentation. To that end, we have worked with Jepsen to enhance our documentation [1], tightening up casual language in the hopes of preventing misconceptions. We also added a tech note [2] addressing this particular misconception directly.
To build on this, Datomic includes a pre-commit conflict check that would prevent this particular example from committing at all: it detects that there are two incompatible assertions for the same entity/attribute pair, and rejects the transaction. We think this conflict check likely prevents many users from actually hitting this issue in production.
The issue we discuss in the report only occurs when the transaction expands to non-conflicting datoms--for instance:
[Stu favorite-number 41]
[Stu hates-all-numbers-and-has-no-favorite true]
These entity/attribute pairs are disjoint, so the conflict checker allows the transaction to commit, producing a record which is in a logically inconsistent state!
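A toy sketch of that pre-commit conflict check (again plain Python, not Datomic's real code): group assertions by entity/attribute pair and reject the transaction if any pair is asserted with two different values.

```python
from collections import defaultdict

def conflicts(tx_data):
    """Group assertions by (entity, attribute); any pair asserted with
    two different values in one transaction is a conflict."""
    by_ea = defaultdict(set)
    for e, a, v in tx_data:
        by_ea[(e, a)].add(v)
    return {ea for ea, vs in by_ea.items() if len(vs) > 1}

# The favorite-number example: same entity/attribute, two values -> rejected.
assert conflicts([("Stu", "favorite-number", 41),
                  ("Stu", "favorite-number", 42)]) == {("Stu", "favorite-number")}

# Disjoint entity/attribute pairs: no conflict, so the transaction commits,
# even though the combined facts are logically inconsistent.
assert conflicts([("Stu", "favorite-number", 41),
                  ("Stu", "hates-all-numbers-and-has-no-favorite", True)]) == set()
```

The second case is exactly the gap described above: the checker sees no overlapping entity/attribute pair, so the logically contradictory record sails through.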
On the documentation front--Datomic users could be forgiven for thinking of the elements of transactions as "operations", since Datomic's docs called them both "operations" and "statements". ;-)
In order for user code to impose invariants over the entire transaction, it must have access to the entire transaction. Entity predicates have such access (they are passed the after db, which includes the pending transaction and all other transactions to boot). Transaction functions are unsuitable, as they have access only to the before db. [2]
Use entity predicates for arbitrary functional validations of the entire transaction.
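Here is a hedged Python sketch of why the before-db is not enough (the `withdraw` and `non_negative` functions are invented stand-ins for a transaction function and an entity predicate, not real Datomic API): two transaction functions each validate a precondition against the same before-db, both pass, yet the combined result violates the invariant. Only a check against the after-db can see the whole transaction.

```python
def withdraw(before_db, account, amount):
    # A transaction-function-style precondition: checked against the
    # *before* db only, never against the other functions' output.
    if before_db[account] - amount < 0:
        raise ValueError("insufficient funds")
    return [(account, -amount)]

before = {"stu": 100}

# Both calls see a balance of 100, so both preconditions pass...
tx_data = withdraw(before, "stu", 70) + withdraw(before, "stu", 70)

# ...but applying all the deltas at one time point yields -40.
after = dict(before)
for account, delta in tx_data:
    after[account] += delta
assert after["stu"] == -40

# An entity-predicate-style check runs against the *after* db, which
# includes the whole pending transaction, so it can reject the result.
def non_negative(after_db, entity):
    return after_db[entity] >= 0

assert non_negative(after, "stu") is False
```

The fix is structural, not a matter of writing more careful transaction functions: the invariant has to be checked where the whole transaction is visible.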
Datomic transactions are not “operations to perform”, they are a set of novel facts to incorporate at a point in time.
A git commit likewise describes a set of modifications. Do you, or should you, care about the order in which the adds, updates, and deletes occur within a single git commit? OMG no, that sounds awful.
The really unusual thing is that developers accept intra-transaction ordering as a given from every other database. OMG, that sounds awful. How do you live like that?
Yeah, this basically boils down to "a potential pitfall, but consistent with documentation, and working as designed". Whether this actually matters depends on whether users are writing transaction functions which are intended to preserve some invariant, but would only do so if executed sequentially, rather than concurrently.
Datomic's position (and Datomic, please chime in here!) is that users simply do not write transaction functions like this very often. This is defensible: the docs did explicitly state that transaction functions observe the start-of-transaction state, not one another! On the other hand, there was also language in the docs that suggested transaction functions could be used to preserve invariants: "[txn fns] can atomically analyze and transform database values. You can use them to ensure atomic read-modify-update processing, and integrity constraints...". That language, combined with the fact that basically every other Serializable DB uses sequential intra-transaction semantics, is why I devoted so much attention to this issue in the report.
It's a complex question and I don't have a clear-cut answer! I'd love to hear what the general DB community and Datomic users in particular make of these semantics.
As a proponent of just such tools, I would also say that "enough rope to shoot(?) yourself" is inherent in any tool powerful enough to get anything done, and is not a tradeoff encountered only when reaching for high power or low ceremony.
It is worth noting here that Datomic's intra-transaction semantics are not a decision made in isolation; they emerge naturally from the information model.
Everything in a Datomic transaction happens atomically at a single point in time. Datomic transactions are totally ordered, and this ordering is visible via the time t shared by every datom in the transaction. These properties vastly simplify reasoning about time.
With this information model, intermediate database states are inexpressible. Intermediate states cannot all have the same t, because they did not happen at the same time. And they cannot have different ts, as they are part of the same transaction.
When we designed Datomic (circa 2010), we were concerned that many languages had better support for lists than for sets; in particular, they had list literals but no set literals.
Clojure of course had set literals from the beginning...
An advantage of using lists is that tx data tends to be built up serially in code. Having to look at your tx data in a different (set) order would make proofreading alongside the code more difficult.
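A small Python illustration of the proofreading point (Python happens to have both list and set literals, which makes the contrast easy to show):

```python
# Tx data built up serially in code: a list preserves the order
# it was written in, so it reads top-to-bottom alongside the code.
tx_list = [("Stu", "favorite-number", 41),
           ("Stu", "likes-pizza", True),
           ("Stu", "favorite-number", 42)]
assert tx_list[0] == ("Stu", "favorite-number", 41)

# A set makes no ordering promise: iteration order is unrelated to the
# order the code added elements, which makes side-by-side proofreading harder.
tx_set = set(tx_list)
assert len(tx_set) == 3
```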
Yes. Perhaps this is a performance choice for DataScript since DataScript does not keep a complete transaction history the way Datomic does? I would guess this helps DataScript process transactions faster. There is a github issue about it here: https://github.com/tonsky/datascript/issues/366
I think the article answers your question at the end of section 3.1:
> "This behavior may be surprising, but it is generally consistent with Datomic’s documentation. Nubank does not intend to alter this behavior, and we do not consider it a bug."
When you say, "situations leading to invariant violations" -- that sounds like some kind of bug in Datomic, which this is not. One just has to understand how Datomic processes transactions, and code accordingly.
I am unaffiliated with Nubank, but in my experience using Datomic as a general-purpose database, I have not encountered a situation where this was a problem.
This is good to hear! Nubank has also argued that in their extensive use of Datomic, this kind of issue doesn't really show up. They suggest custom transaction functions are infrequently written, not often composed, and don't usually perform the kind of precondition validation that would lead to this sort of mistake.
Yeah, I've used transaction functions a few times but never had a case where two transaction functions within the same d/transact call interacted with each other. If I did encounter that case, I would probably just write one new transaction function to handle it.
Sounds similar to the need to know that in some relational databases, you need to SELECT ... FOR UPDATE if you intend to perform an update that depends on the values you just selected.