transactional's comments

(Hi! Post author here.)

The post is written with a lean towards serializable, partly because there's a wide variety of easy examples to pull from which all implement serializable, but the ideas mostly extend to non-serializable as well. A non-serializable but still MVCC database will also place all of a transaction's writes at a single commit timestamp; it just doesn't try to serialize the reads there, and that's fine. When looking at non-serializable, non-MVCC databases, it's still useful to ask how the system does each of the four parts in isolation. Maybe I should have been more direct that you're welcome to bend or break the mental model in whatever ways help you understand a given database.
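To make the "all writes land at a single commit timestamp" idea concrete, here's a toy sketch of a versioned key-value store. It's not any real database's implementation (the class and method names are made up for illustration): a transaction buffers its writes and installs them all at one commit timestamp, while readers see a snapshot as of their read timestamp.

```python
import itertools


class MVCCStore:
    """Toy MVCC store: each key maps to a list of (commit_ts, value) versions."""

    def __init__(self):
        self.versions = {}               # key -> [(commit_ts, value), ...], ts-ascending
        self.clock = itertools.count(1)  # monotonically increasing timestamps

    def begin(self):
        return Txn(self, read_ts=next(self.clock))

    def read(self, key, ts):
        # Return the newest version committed at or before ts.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


class Txn:
    def __init__(self, store, read_ts):
        self.store, self.read_ts, self.writes = store, read_ts, {}

    def get(self, key):
        if key in self.writes:           # read-your-own-writes
            return self.writes[key]
        return self.store.read(key, self.read_ts)

    def set(self, key, value):
        self.writes[key] = value         # buffered until commit

    def commit(self):
        # All of this transaction's writes become visible atomically,
        # stamped with one shared commit timestamp.
        commit_ts = next(self.store.clock)
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts
```

A transaction that began before the commit sees none of the writes; one that begins after sees all of them, which is exactly the single-commit-timestamp behavior described above.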

The line specifically about MySQL running at serializable was because it was in the Spanner section, and Spanner is a (strictly) serializable database.


Thanks for the clarifications and diagrams. I can see how using something like Spanner from the outset makes it natural to adopt and stick with serializable isolation. With other SQL databases, I've mostly seen repeatable read, read committed, and even read uncommitted used in the name of performance. Read committed works fine, but you have to design everything for it from the start, with thoughtful write and read sequences.

Moving to serializable should be easy, but with Spanner and the like it isn't, because you can't make 100+ sub-millisecond queries to respond to an API request if that's how your app evolved.

The way I imagine the future is bringing the code closer to the data, like stored procedures, but maybe in a new way: modern languages compiled to run (and, if necessary, retry) in a shard of the database.


I'll rephrase the line sometime, but the intention is to communicate that it's the default choice if you're making this decision in 2024. Most of the projects which aren't using some form of direct IO avoid it because they predate O_DIRECT (e.g. postgres), didn't architect for async IO (e.g. mongo), or have excessive portability concerns which mean they only offer it as an option (e.g. innodb).


A recent blog post about MongoDB's SBE detailed that the purpose behind their VM is that it serves as a way to re-use an execution component between two different query languages. SQLite's claim was just that the strong separation between compilation and execution makes issues easier to debug.

I wouldn't expect VMs to become the default design in databases, but they seem to be getting increasingly common as an IR for query compilation. The ability to have a (comparatively simpler) interpreter for the VM also means you can apply simple fuzzing to great effect: if the results of interpretation and compilation ever diverge, there's a bug.
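The fuzzing idea is easy to sketch. Below is a hypothetical illustration (not any real database's VM): a tiny stack machine with both an interpreter and a "compiler" that lowers each opcode to a closure, plus a random-program generator. Any divergence between the two execution paths is a bug.

```python
import random


def interpret(prog):
    """Directly interpret a list of (opcode, arg) tuples on a stack."""
    stack = []
    for op, arg in prog:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "mul":
            stack.append(stack.pop() * stack.pop())
    return stack[-1]


def compile_prog(prog):
    """'Compile' by lowering each opcode to a closure ahead of time."""
    steps = []
    for op, arg in prog:
        if op == "push":
            steps.append(lambda st, v=arg: st.append(v))
        elif op == "add":
            steps.append(lambda st: st.append(st.pop() + st.pop()))
        elif op == "mul":
            steps.append(lambda st: st.append(st.pop() * st.pop()))

    def run():
        st = []
        for step in steps:
            step(st)
        return st[-1]

    return run


def random_prog(rng, n=5):
    """Generate a random, stack-safe program."""
    prog = [("push", rng.randint(-9, 9))]
    depth = 1
    for _ in range(n):
        if depth >= 2 and rng.random() < 0.5:
            prog.append((rng.choice(["add", "mul"]), None))
            depth -= 1
        else:
            prog.append(("push", rng.randint(-9, 9)))
            depth += 1
    return prog


# Differential fuzzing: interpretation and compilation must always agree.
rng = random.Random(0)
for _ in range(1000):
    prog = random_prog(rng)
    assert interpret(prog) == compile_prog(prog)(), prog
```

Real query VMs have vastly larger opcode sets and inputs, but the differential-testing loop is the same shape.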


MonetDB has been doing this (VM for query execution) since 2002-2004ish.


Another related aspect is that the VM approach allows JITing only some of the opcodes, which makes JIT compilation more feasible and maintainable.
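As a hypothetical sketch of that idea (the opcode names and helpers here are invented for illustration): only a couple of hot opcodes get specialized "compiled" closures, and everything else falls back to a generic interpreted handler, so the JIT surface stays small.

```python
def interp_handler(op, arg, stack):
    """Catch-all interpreted path for opcodes we chose not to JIT."""
    if op == "dup":
        stack.append(stack[-1])
    elif op == "neg":
        stack.append(-stack.pop())
    else:
        raise ValueError(f"unknown opcode: {op}")


# Only these opcodes have specialized ("JITted") implementations.
JITTED = {
    "push": lambda arg: (lambda st, v=arg: st.append(v)),
    "add":  lambda arg: (lambda st: st.append(st.pop() + st.pop())),
}


def compile_partial(prog):
    steps = []
    for op, arg in prog:
        if op in JITTED:
            steps.append(JITTED[op](arg))  # specialized fast path
        else:
            # Fall back to the interpreter for everything else.
            steps.append(lambda st, o=op, a=arg: interp_handler(o, a, st))

    def run():
        st = []
        for step in steps:
            step(st)
        return st[-1]

    return run


print(compile_partial([("push", 3), ("dup", None), ("add", None), ("neg", None)])())  # prints -6
```

New opcodes can ship interpreter-only first and be promoted to the JITted set later, without touching the rest of the pipeline.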


It’s interesting to me that in the original proposal for fixing fsync to actually be durable (https://lwn.net/Articles/270891/), there was thought given to the desire for a non-flushing write barrier (ctrl-f SYNC_FILE_RANGE_NO_FLUSH), but it appears that it never got implemented.


There's one set of folks working on a btree, and another set of folks now working on a RocksDB storage engine.

The original reason for steering away from RocksDB was that it doesn't play well with deterministic simulation. Any code included in FDB needs to be able to run with coroutines (strongly preferably stackless ones, though sqlite's btree has a stackful coroutine shimmed under it). RocksDB is definitely not written to support coroutines, so trying to use it anyway sacrifices developers' ability to dig into failures.

Redwood has a couple design decisions that would make it a poor general purpose btree, but a great one for FoundationDB. But RocksDB will still have write and space amplification advantages.


If you don’t like being told what to do, boy do I have bad news about PhDs for you...


(Disclaimer: affiliated)

Redwood isn't "completed", but it's not unavailable. Evaluating it via `fdbcli> configure ssd-redwood-1-experimental` is encouraged, but it hasn't seen sufficiently deep evaluation and verification to be set as the default ssd storage engine yet.

Not all releases got blog posts. Which ones did honestly had far more to do with the people involved in the releases at the time than with any technical merit. This is the first feedback I've seen where the blog posts were used as a signal of worthiness or stability, so we'll see if we can be a bit more responsible about making release posts.

Releases that are posted to the downloads page are all equally considered "ready to go". They appear a bit slower than what gets tagged on github, as the production environments in the core set of companies supporting FDB development are used as the last stage of QA before an official release. (Though this distinction might be less clear as the downloads page was recently redirected to the github release artifacts page.)

You can find the 7.0 release notes at https://github.com/apple/foundationdb/blob/main/documentatio..., which will appear in the documentation once the public site is updated to 7.0 (which happens once the first official, public release is blessed).

All that said, the regular release cadence so far has been about every 6 months, which I think still does qualify as "glacial".


Well, every six months for a core DB engine seems OK. I don't know if I'd describe that as glacial, but the problem is: how would anyone know that without serious investigation? There's a 7.0 release that isn't marked as beta or anything, yet the release notes can only be found in git under a directory named "sphinx". It's very ambiguous whether this is released or not, which is weird, and doesn't send positive signals about the overall organization and focus of the project. The message it sends is that Apple has no interest in increasing FDB usage outside of Apple, which from an end user's perspective increases risk. What if Apple decides it's time to move on, or closes development again? A community of users and devs helps, but that's not going to emerge unless the community is a priority. Maintaining the website/blog would be step one in that process, I'd think.


I understand nothing of helm, but https://github.com/FoundationDB/fdb-kubernetes-operator/pull... seems to suggest yes?

