The Anatomy of a Durable Execution Stack from First Principles

stsffap · 2025-02-20T15:08:49 1740064129

Hi everyone, I am helping build Restate. If you want to try deploying a distributed Restate cluster, you can do so with only a few commands. All you need is Docker and our guide here: https://docs.restate.dev/guides/cluster.

Let us know what you think about it :-)

whoiskatrin · 2025-02-20T15:51:39 1740066699

Great read. How does it handle out-of-order events or scenarios where events need to be processed in strict order across different partitions? Are there built-in mechanisms to enforce global ordering when needed?

stsffap · 2025-02-20T16:29:49 1740068989

If events need to be processed in strict order across different partitions, then you need to send these events to a single key (and thereby to a single partition). A partition consists of multiple service keys, and Restate ensures strictly ordered processing for each of them. Note that different keys don't block each other from being executed (no head-of-line blocking across different keys).
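To make the key-to-partition idea concrete, here is a minimal sketch in plain Python. It does not use the Restate SDK and is not Restate's actual routing logic: `NUM_PARTITIONS`, `partition_for`, and `Partition` are hypothetical names, and hashing the key is only an assumption about how a key could be pinned to a partition. It shows that one key always lands on the same partition, and that each key has its own FIFO queue, so a backlog on one key does not hold up another.

```python
import hashlib
from collections import defaultdict, deque

NUM_PARTITIONS = 4  # hypothetical partition count, for illustration only


def partition_for(key: str) -> int:
    """Map a key to a partition with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class Partition:
    """Holds one FIFO queue per key, so keys never wait on each other."""

    def __init__(self) -> None:
        self.queues = defaultdict(deque)

    def enqueue(self, key: str, event: str) -> None:
        # Events for a key are appended in arrival order.
        self.queues[key].append(event)

    def next_for(self, key: str):
        # Return the oldest pending event for this key, or None if empty.
        queue = self.queues.get(key)
        return queue.popleft() if queue else None


partitions = [Partition() for _ in range(NUM_PARTITIONS)]


def send(key: str, event: str) -> None:
    """Route an event to the partition that owns its key."""
    partitions[partition_for(key)].enqueue(key, event)
```

Sending everything that must be globally ordered to one key (for example `send("orders", ...)`) funnels it through a single queue, which is the trade-off described above: strict order in exchange for no parallelism on that key.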

The order in which events are processed for each key is their arrival order. If you need to handle out-of-order events, then you can implement this as part of a virtual object which can store events and re-order them based on other events that carry some form of watermark or based on time.
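As a rough illustration of the watermark-based re-ordering mentioned above, here is a small sketch in plain Python. It is not written against the Restate SDK: `ReorderBuffer` and its methods are hypothetical, and in a real virtual object the buffered events would live in the object's durable state rather than in an in-memory heap. Events are buffered as they arrive, and once a watermark says "nothing older than T will arrive anymore", everything up to T is released in timestamp order.

```python
import heapq


class ReorderBuffer:
    """Buffers out-of-order events and releases them in timestamp order."""

    def __init__(self) -> None:
        self._heap = []  # min-heap of (timestamp, arrival_seq, payload)
        self._seq = 0    # tie-breaker: equal timestamps keep arrival order

    def add(self, timestamp: int, payload: str) -> None:
        """Store an event; nothing is released until a watermark arrives."""
        heapq.heappush(self._heap, (timestamp, self._seq, payload))
        self._seq += 1

    def advance_watermark(self, watermark: int) -> list:
        """Release all buffered events with timestamp <= watermark, sorted."""
        released = []
        while self._heap and self._heap[0][0] <= watermark:
            timestamp, _, payload = heapq.heappop(self._heap)
            released.append((timestamp, payload))
        return released


buffer = ReorderBuffer()
buffer.add(3, "c")
buffer.add(1, "a")
buffer.add(2, "b")
first_batch = buffer.advance_watermark(2)   # events up to timestamp 2
second_batch = buffer.advance_watermark(5)  # the remaining event
```

A time-based variant would call `advance_watermark` from a timer instead of waiting for an explicit watermark event. Either way, events arriving after their watermark has passed need a separate policy (drop, or process late).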

sewen · 2025-02-20T14:17:51 1740061071

The post discusses the design considerations when building a durable execution runtime from the ground up.

The goal is a highly available, transactional, scalable, and low-latency runtime in a self-contained binary that scales from a laptop to a complex distributed deployment.

oulipo · 2025-02-20T14:44:36 1740062676

Interesting. What would be the advantages of Restate compared to Inngest or Dbos?

stsffap · 2025-02-20T15:05:04 1740063904

We also put a lot of energy into making Restate as simple to operate as possible. We learned the hard way when building Apache Flink that operating a distributed system is challenging, especially if it relies on external systems like ZooKeeper. Therefore, Restate comes as a batteries-included single binary that does not need any external dependencies, so you don't have to understand and operate multiple systems. Moreover, you can start with a single-node deployment and later turn it into a multi-node deployment by "simply" starting new processes that connect to the existing cluster.

Restate itself sits between your services and your users' requests. It is designed to push invocations to your service endpoints, which allows it to work well with serverless platforms such as AWS Lambda, Cloud Run functions, etc.

p10jkle · 2025-02-20T14:54:01 1740063241

As discussed in the article, we built our own storage engine from the ground up because we believe it will achieve better performance by taking advantage of the characteristics of the system (streaming data, single writer, etc.) instead of shoehorning it into a DBMS. So our performance goals are very high throughput (hundreds of thousands of actions per second, scaling horizontally) with very low latencies (for example, 40ms p90 under load for a 3-step workflow).

jedberg · 2025-02-20T19:24:32 1740079472

> instead of shoehorning it into a DBMS

Disclaimer, I'm the CEO of the aforementioned DBOS.

That's an interesting way to phrase it. We like to think that we've taken advantage of 50 years of DBMS development by optimizing how the database is used. We also take advantage of the fact that your application is already accessing the database for application data, and we sit right next to it, not on another service. So our added latency is in the single-digit milliseconds (an order of magnitude faster than any external solution).

Since we are on the same database as your application data, our throughput scales with your application seamlessly as you scale your database to meet your application needs. It's part of our lightweight promise for durability -- no external services required.