One thing I'm missing when modeling an FSM like this is different states having different sets of constraints, even if it's only nullability. It's a shame to have to make a field nullable just because the entity's initial state doesn't have the appropriate value yet.
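You can claw some of this back with a conditional CHECK constraint; a minimal sketch, with a hypothetical orders table and made-up status values:

CREATE TABLE orders (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status text NOT NULL,
    -- no tracking number exists before the order ships, so the column must be nullable...
    tracking_number text,
    -- ...but we can still require it in the states where it has to exist
    CHECK (status NOT IN ('shipped', 'delivered') OR tracking_number IS NOT NULL)
);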
For any use case I can think of (all "small data", under 1B rows), I'd rather use DB migrations to take care of this problem than force myself to live with my previous design decisions indefinitely.
I am attracted to clever, but in practice simple always beats clever.
Wouldn't that risk some FSM getting modified while it's 'running'? That may or may not be desirable, I guess. E.g. a customer not knowing that the process changed since they placed an order, and no longer understanding it.
This is a nice blog and I like the versioning feature. If you're looking to explore this idea more, I made a very simple unversioned FSM extension for Postgres years ago, for fun and no profit:
I've always wondered about this. A while ago I saw a library similar to this that modeled FSMs in Postgres, and it had a comment saying "todo: cycle detection", which made me wonder whether it's even possible to do this in the database.
Hmm… consider the following: your FSM is acyclic iff you can assign each state an integer depth such that a state at depth d only ever transitions to states at depths strictly greater than d. (A topological sort gives you such depths, and conversely, any cycle would have to contain a transition that fails to increase depth.) That suggests these tables:
-- assumes the enum types from the post; for this toy example, something like:
--   CREATE TYPE order_state AS ENUM ('start', 'a', 'b', 'done', 'error');
--   CREATE TYPE order_event AS ENUM ('A', 'B');

CREATE TABLE states (
    state order_state PRIMARY KEY,
    depth int NOT NULL,
    -- redundant given the PK, but needed as the target of the composite FKs below
    UNIQUE(state, depth)
);

CREATE TABLE transitions (
    start_state order_state NOT NULL,
    event order_event NOT NULL,
    end_state order_state NOT NULL,
    -- denormalized copies of the endpoints' depths, kept in sync via ON UPDATE CASCADE
    start_depth int NOT NULL,
    end_depth int NOT NULL,
    CONSTRAINT transitions_start_depth_correct
        FOREIGN KEY(start_state, start_depth) REFERENCES states(state, depth)
        ON UPDATE CASCADE,
    CONSTRAINT transitions_end_depth_correct
        FOREIGN KEY(end_state, end_depth) REFERENCES states(state, depth)
        ON UPDATE CASCADE,
    -- every transition must go strictly deeper, which is what rules out cycles
    CONSTRAINT transitions_depth_increases CHECK(end_depth > start_depth),
    PRIMARY KEY(start_state, event)
);
Let's bang on it for a quick test. You can define a state machine; here's one that roughly matches the regex `^(AB|BA)$` (I know I'm being a bit sloppy):
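INSERT INTO states (state, depth) VALUES
    ('start', 1),
    ('a', 2),
    ('b', 2),
    ('done', 3),
    ('error', 3);

INSERT INTO transitions (start_state, event, end_state, start_depth, end_depth) VALUES
    ('start', 'A', 'a', 1, 2),
    ('start', 'B', 'b', 1, 2),
    ('a', 'B', 'done', 2, 3),
    ('b', 'A', 'done', 2, 3),
    ('a', 'A', 'error', 2, 3),
    ('b', 'B', 'error', 2, 3);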
And, as you need to modify it, you can increase a node's depth to make room for intervening nodes:
fsm=> UPDATE states SET depth = 4 WHERE state = 'error';
UPDATE 1
fsm=> TABLE transitions;
 start_state | event | end_state | start_depth | end_depth
-------------+-------+-----------+-------------+-----------
 start       | A     | a         |           1 |         2
 start       | B     | b         |           1 |         2
 a           | B     | done      |           2 |         3
 b           | A     | done      |           2 |         3
 a           | A     | error     |           2 |         4
 b           | B     | error     |           2 |         4
(6 rows)
But you can't decrease a node's depth too far:
fsm=> UPDATE states SET depth = 2 WHERE state = 'error';
ERROR: new row for relation "transitions" violates check constraint "transitions_depth_increases"
DETAIL: Failing row contains (a, A, error, 2, 2).
CONTEXT: SQL statement "UPDATE ONLY "public"."transitions" SET "end_state" = $1, "end_depth" = $2 WHERE $3::pg_catalog.text OPERATOR(pg_catalog.=) "end_state"::pg_catalog.text AND $4 OPERATOR(pg_catalog.=) "end_depth""
And you can't introduce transitions that don't increase depth:
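fsm=> INSERT INTO transitions VALUES ('done', 'A', 'start', 3, 1);
ERROR: new row for relation "transitions" violates check constraint "transitions_depth_increases"
DETAIL: Failing row contains (done, A, start, 3, 1).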
Now, I don't know that I would immediately recommend this for high-throughput production use. You're storing "unnecessary" state not once but many times (each state's depth appears `1 + \deg(v)` times), plus additional indices and lookups. But I do think it meets the desired consistency goals!
Flexibility and constraint are both useful. (Constraint is one of the reasons to use a finite state machine in the first place.) An overly constrained system can't function; an overly flexible system can't be relied upon. You'll answer your question the day you encounter a problem that NoSQL is a bad fit for, and after you learn SQL it won't feel so alien or complicated. Or maybe that day will never come and you'll never feel the need to learn SQL.
For a motivating example, consider a billing or accounting system. This domain is well understood, so we probably don't need the ability to rapidly evolve our schema. And if we violate the constraints of this system, there may be serious consequences, like spending money we don't have or billing for the service twice.
We could build this system with either a NoSQL or a SQL database, but SQL seems to me to be the natural choice.
I understand what you're saying, but in the accounting example we could also solve this problem with NoSQL, because the most important feature there is transaction support. Similarly, schema-on-write can be provided by a library.
To me it seems like NoSQL works better when there is less to normalize, which is the case with microservices. Those services struggle when they have to make a distributed transaction. That problem is easily solved in SQL (assuming it's not sharded to completely denormalize everything for performance).
Note that this normalization problem also shows up with schema-on-write: when people from multiple teams contribute to a schema, it becomes hard to maintain.
NoSQL will bring its own set of issues, right? A system that's too flexible, with no built-in validations, means they have to be handled somewhere else. Take the example in the post: a refund should not precede awaiting payment. If a new status gets added, in SQL it's easy to know where the migrations have to be run; in NoSQL we either write something custom or handle it each time the document is read.
I mean, you shouldn't be throwing shit at the DB and seeing what sticks, but why not have the absolute source of truth also be the validator? CHECK constraints are fairly cheap.
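For the refund example upthread, that's a one-liner (a sketch, assuming hypothetical status and paid_at columns):

ALTER TABLE orders ADD CONSTRAINT refund_requires_payment
    CHECK (status <> 'refunded' OR paid_at IS NOT NULL);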