Soft-deletes in relational databases are almost always a mistake. Most use cases are actually business rules/processes that need to be examined clearly and handled more explicitly and contextually, at least if you ever want to restore anything.
For example, you don't soft-delete a customer row, you "deactivate" a customer, and change logic that looks at customer-related entities to fit that. The "Customers" page that shouldn't show those anymore gets retitled to "Active Customers." Later there will be either a process for reactivating customers or one for using a permanently deactivated customer as a template for a new one.
What goes wrong with the classic soft-delete approach? The one where every table has a boolean is_deleted column and hopefully the deleting code updates the correct subgraph of records at the same time?
It typically rots as you expand your model and add features. It's not clear what significance different tables have, and you can't "restore" things with any reliability or safety. That may be because of relational/index conflicts, or because you don't have enough information about how it was originally deleted to actually "restore" something that is both correct and useful, or because there are actually 3 different kinds of restoration schemes/scopes but the current problem calls for a fourth.
> What goes wrong with the classic soft-delete approach? The one where every table has a boolean is_deleted column and hopefully the deleting code updates the correct subgraph of records at the same time?
> It typically rots as you expand your model and add features. It's not clear what significance different tables have, and you can't "restore" things with any reliability or safety. That may be because of relational/index conflicts, or because you don't have enough information about how it was originally deleted to actually "restore" something that is both correct and useful, or because there are actually 3 different kinds of restoration schemes/scopes but the current problem calls for a fourth.
The long and short of soft deletes is that it breaks my data's correctness guarantees WRT referential integrity: i.e. I 'delete' a record while there are still references to it, and the RDBMS cannot even warn me, much less stop me!
If I wanted broken data, I'd use a NoSQL DB, or flat files even.
Also makes UNIQUE indices much harder to use. You can't just delete and later re-insert the same data, you need to "undelete" it instead, but that can actually mess up the chronology of the data, so then you need a separate table to track dates... and things just get over-engineered quickly...
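To make the re-insert problem concrete, here is a minimal Postgres sketch (a hypothetical users table, not taken from the comment above):

CREATE TABLE users (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email      text NOT NULL UNIQUE,
    deleted_at timestamptz
);

INSERT INTO users (email) VALUES ('a@example.com');
UPDATE users SET deleted_at = now() WHERE email = 'a@example.com';  -- the "soft delete"
INSERT INTO users (email) VALUES ('a@example.com');  -- fails: unique violation, the "deleted" row still holds the email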
> For example, you don't soft-delete a customer row, you "deactivate" a customer
I'm all for naming things more precisely, but functionally speaking, at that point it's really just semantics: you still end up with some DB column acting as a flag, and from then on you need to take it into account in every single query that touches that table in the whole app. Since most modern ORMs know how to handle soft deletes internally, it's far easier to just stick to the defaults and use `deleted_at` if you really need a way to keep the records around. And you often do need it, for the referential integrity of the historical data.
Nah, there's a structural difference between (A) adding a status column to one table and changing your queries and joins to care/not care about it, versus (B) adding is_deleted to every table and layering it in as a bonus WHERE clause everywhere.
In other words, the thing being solved is not suitable for a cookie-cutter table/row-level design. It's a real business-logic quality of your data model that lives in particular places and means slightly different things in those different places.
This is a problem, but IMO the bigger problem is that you create a pit of failure rather than a pit of success: you now have to remember to add "and deleted_at is null" everywhere, and forgetting to do so often doesn't look like a bug until the day some user sees something they thought had been deleted still appear.
Perhaps it's a scale thing, but I get more "oh crap, I just deleted this thing, can you get it back for me?" requests than I do bug reports about soft deleted records showing up incorrectly. Being able to restore records immediately is a big win imo.
Being able to contact a dev to undelete a record from a database does sound like a thing that's hard to scale. If this happens a lot, it could be worth examining if there's a UX problem.
This can be addressed with PG views. You create a view where deleted_at IS NULL and now you query the view instead. Furthermore, if the view is “updatable” (which is the case when it only has filters), you can also do updates and deletes directly against it, and it all just works. So as long as your ORM/query builder can work with views, you are good to go.
I understand many devs want to avoid moving more logic to the database (I also personally avoid stored procedures and similar) but views are quite declarative and simpler to manage than indexes.
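A rough sketch of that setup, assuming a hypothetical customer table with a name and a deleted_at column:

CREATE VIEW active_customer AS
SELECT * FROM customer
WHERE deleted_at IS NULL;

-- a simple single-table view like this is automatically updatable in Postgres,
-- so writes against it are rewritten onto the underlying customer table
UPDATE active_customer SET name = 'New name' WHERE id = 42;
DELETE FROM active_customer WHERE id = 43;  -- still a real DELETE unless you add a rule or INSTEAD OF trigger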
I haven't used row level security, but wouldn't it be easy to set the table up such that only a dedicated recovery role could select rows that have deleted_at set? Then the view wouldn't even be necessary.
too much magic for my taste. I think the more proper thing would be to simply use views, and expose the natural absurdity of the design with one-view-per-table.
Right. I have never tried those but they sound unappealing to me. I would expect edge cases where they don't work and unpredictable performance implications. But as I said, I have no experience so this is just my imagination.
IDK. I think people can overthink the problem a bit.
I've done the is_deleted flag with a deleted_at timestamp that gets cascaded down the table relations. Restoring is an admin job, and the logic is to follow the table relations down again, restoring linked objects with the same deleted_at timestamp.
I have this in a metadata database that controls a data warehouse. Accidentally knocking out a metadata record could nuke large parts of the warehouse, so I have to have this and a locked flag that prevents records from being removed or even soft deleted once data is in the warehouse.
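Roughly what that cascade looks like, with hypothetical orders/order_items tables and a locked column standing in for the setup described above:

-- soft-delete a parent and stamp its children with the same timestamp
WITH del AS (
    UPDATE orders
    SET is_deleted = true, deleted_at = now()
    WHERE id = 42 AND NOT locked        -- the locked flag blocks removal entirely
    RETURNING id, deleted_at
)
UPDATE order_items oi
SET is_deleted = true, deleted_at = del.deleted_at
FROM del
WHERE oi.order_id = del.id;

-- restoring is the same walk in reverse, matching children on the parent's
-- deleted_at value so unrelated soft-deleted rows are left alone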
> I have this in a metadata database that controls a data warehouse. Accidentally knocking out a metadata record could nuke large parts of the warehouse
I don't have problems with that, we have effective controls. But the warehouse data requires context, and that's supplied via our metadata db. Once metadata starts having actual records associated with it you can't remove it, or you just end up storing junk.
Pretty much every soft-delete implementation I've seen involves modifying a table's schema to add a deleted_at column.
Invariably, this becomes a rather annoying thorn in my side when I need to do more advanced indexing, like unique indexes for upserts. Nothing impossible, but I'm quickly swimming upstream against whatever ORM I happen to be using (I'm looking at you, Prisma).
What I haven't seen much of is a trigger-based system which moves the deleted record into an entirely new table. Is it possible to make a deletions table that always mirrors the source table's schema? I know table inheritance is a thing but I don't know if that applies to future schema changes.
Anyways, I know deleted_at quite well, but I long for something better.
I wonder if this could be solved with partitioning on the deleted_at column; you’d only need two partitions: one for live data (deleted_at is null) and one for deleted data. Then, in Postgres at least, you could just query the live partition directly when you want to skip deleted records.
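Something like that should work in Postgres by list-partitioning on the expression (deleted_at IS NULL). A rough, untested sketch with a hypothetical customer table (caveat: partition-wide unique constraints are awkward with an expression partition key, so treat this as an outline):

CREATE TABLE customer (
    id         bigint NOT NULL,
    name       text   NOT NULL,
    deleted_at timestamptz
) PARTITION BY LIST ((deleted_at IS NULL));

CREATE TABLE customer_live    PARTITION OF customer FOR VALUES IN (true);
CREATE TABLE customer_deleted PARTITION OF customer FOR VALUES IN (false);

-- since Postgres 11, an UPDATE that sets deleted_at moves the row between partitions,
-- and you can read the live partition directly to skip deleted records
SELECT * FROM customer_live;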
Yeah, but Ecto does. You can implement the Repo.prepare_query/3 callback to make sure the query has WHERE deleted_at IS NULL. In fact, that's the exact example in the documentation.
You don't have to reply to a comment if you don't really have anything else to say. What's the point of saying "Now I have two problems" without saying what the new problem is? So what is it? It can't be that an ORM has been added to the mix, since Ecto isn't an ORM.
The problem is you have to manage indexes in the database and keep that in sync with each application that queries it. Ecto is one of those applications.
I like this (1). Like anything related to triggers, you need to be mindful of tables with heavy throughput and maybe move the logic to "code" so you can batch or skip things, but for most tables it works like a charm.
You are right, the main downside of the approach in the article is that you have to remember to filter the deleted_at column on constraints and consider how foreign keys cascade.
Since I am exploring a sync engine, having a separate table for deletes means I’d need to run additional queries when catching up, which led me to explore the single table approach. Let’s see how it evolves.
You read my mind. Yes, this is possible. I haven’t done it in prod, but I made a PoC. As to table inheritance, yes, you can also do that, and yes, in PG anyway it automatically propagates schema changes.
There are quite a few gotchas [0], though, so think carefully before doing so.
The below assumes you have a user/customer table named customer, and another copy named inactive_customer. One other note: if you use an IDENTITY for the PK in the main table, you’ll want to drop that first in the inactive_customer table, so the ids match. Example below.
ALTER TABLE inactive_customer ALTER COLUMN id DROP IDENTITY;
CREATE FUNCTION move_inactive_customer()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
BEGIN
    -- OLD.* expands to the whole deleted row, so this works as long as
    -- inactive_customer keeps the same column layout as customer
    INSERT INTO inactive_customer
    SELECT OLD.*
    ON CONFLICT (id) DO NOTHING;  -- modify for your table constraints
    RETURN OLD;
END;
$function$;
CREATE TRIGGER redirect_customer
BEFORE DELETE
ON customer
FOR EACH ROW
EXECUTE PROCEDURE move_inactive_customer();
I mean, Prisma is just awful, that's hardly a problem of soft deletes.
At my previous job we used Prisma, and what a terrible system, the amount of problems I saw there. Some have been fixed since, but it shouldn't take 3 major versions of an ORM to use an actual upsert operation instead of doing
if select() then update() else insert() at the client side, which is all sorts of wrong.
Or the 50ms of latency it introduced per query by starting a separate process that generated the queries, or the lack of schema flexibility, or the idiotic new join features that transform rows into JSON in a destructive way.
I've hit a lot of rough edges with it too. A few off the top of my head:
- in memory joins (until they added the preview relationalJoin feature)
- no partial indexes
- no select for update
- no table partitioning (though if you do this out of band it all works fine)
- use of create type enum instead of check constraints makes maintenance of enumerated columns more challenging (IMO)
- little support for less common (but useful!) data types like geospatial, intervals, ltree, etc
- the reliance on wasm/rust makes debugging what it's doing difficult
- the connection pooling defaults are surprising (never reaps idle connections, AFAIK not cgroups aware and over provisions the pool size based on host CPU count)
For simple use cases it's appealing, and I suspect it's probably a better experience with nosql databases, but when you're used to taking full advantage of postgres' feature set it gets frustrating quickly.
There are issues in the tracker for basically everything above, and some may have even been solved by now - in a few more years it might be a better experience.
> in memory joins (until they added the preview relationalJoin feature)
This, to me, was the single issue that made me forever not take Prisma seriously. I get that “ship now, fix later” is a mindset, but you simply cannot ship an RDBMS wrapper that doesn’t handle JOINs. That’s an integral part of the system.
> the connection pooling defaults are surprising (never reaps idle connections, AFAIK not cgroups aware and over provisions the pool size based on host CPU count)
Tbf I have yet to find a piece of software that hasn’t messed up cgroupsv1. That’s not really an excuse – read docs – but at least everyone seems to make the same mistake.
I suppose "on delete cascade" would fire appropriate triggers in the downstream tables, and "on delete set null" will take care of upstream and peer tables.
If this is not sufficient, and a live record in one table may legitimately refer to a "dead" record in another table, then well, maybe "soft delete" is what you need, and your data just has a deactivated state.
However, in my experience, it's more easily maintainable to use normal deletes, have a trigger to move it to a schemaless table (jsonb column type that contains all of the data at the time of delete), and have manual restore logic via a good UX.
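That pattern is fairly compact in Postgres. A minimal sketch, using a hypothetical generic deleted_record table (all names made up here):

CREATE TABLE deleted_record (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    table_name text        NOT NULL,
    data       jsonb       NOT NULL,
    deleted_at timestamptz NOT NULL DEFAULT now()
);

CREATE FUNCTION archive_deleted_row()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    -- to_jsonb(OLD) captures the whole row at the time of delete, schemalessly
    INSERT INTO deleted_record (table_name, data)
    VALUES (TG_TABLE_NAME, to_jsonb(OLD));
    RETURN OLD;
END;
$$;

-- attach to any table; the restore path stays manual, as described above
CREATE TRIGGER archive_customer
AFTER DELETE ON customer
FOR EACH ROW EXECUTE FUNCTION archive_deleted_row();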
For Django, django-simple-history [1] is killer at this. It doesn't track perfectly in the same way as temporal tables (if a DB operation doesn't trigger a save hook, it won't be tracked) but the ergonomics for Django work is pretty unbeatable. Great integration into the ORM + auth + admin. History tables automatically derived from model definitions with sane nullability defaults, sane migration strategy, and hackability where needed.
It's made "can we get an audit table / undo delete for XYZ?" feature requests really easy.
Checking whether the trigger exists is already built into the code, check it out.
As far as changes (ALTER TABLE) go, I need to take history tables into consideration when writing migration scripts. Since I always do my migrations with SQL scripts, that's not an issue.
However, there's a concept of system event triggers in PostgreSQL that can intercept alter table statements, so I'm thinking to expand my solution to alter history tables automatically. Check it out:
https://www.postgresql.org/docs/current/event-triggers.html
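For anyone curious, a rough sketch of what that might look like (the actual diffing and ALTER generation for the history table is the hard part and is only hinted at here):

CREATE FUNCTION mirror_ddl_to_history()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        RAISE NOTICE 'intercepted % on %', obj.command_tag, obj.object_identity;
        -- here you would build and EXECUTE the matching ALTER for the history table
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER history_ddl_mirror
ON ddl_command_end
WHEN TAG IN ('ALTER TABLE')
EXECUTE FUNCTION mirror_ddl_to_history();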
While I appreciate all criticism of this approach in the comments, it's not that simple.
1. There may be audit policies, compliance policies, retention policies in place. You may HAVE to keep data which is deleted from user's perspective.
2. People make mistakes. While I agree that restoring a long-ago soft-deleted object may be very tricky because of referential integrity, it's easy to restore a recently (hours, days) soft-deleted object. Not only easy, it's much faster than restoring a point-in-time backup of an entire database. Also, the restored version is guaranteed to be the latest one, unlike in a backup.
3. PostgreSQL has partial indexes; other databases support similar features too. That allows excluding soft-deleted records from indexes and makes indexes smaller. A good query does not scan a table anyway. If one needs to read fewer pages from a table, PostgreSQL has the CLUSTER command. If there are too many soft-deleted tuples, maybe it's time to think about an archive table anyway.
Unique constraints may ignore soft-deleted objects, so only active objects need to be unique (sketch at the end of this comment). That is very useful if you need to support idempotent calls. Additionally, that may actually be a security feature too: if I create a new user with the same username as an old deleted user, I know that username is unique, but also that I will not accidentally get permissions of the old user assigned.
4. The worst is tooling. Forgetting "WHERE deleted_at IS NULL" or "WHERE NOT is_deleted" is a very real problem. I would rather advise against soft-deleted objects because of that problem alone.
On the other hand, if you use a decent ORM (and why wouldn't you?) like SQLAlchemy, Django ORM or EntityFramework/Linq (sorry for the folks who are not on Python/.Net, I do not have a good example for you), then it's quite easy to create a default model query that is not empty, i.e. one that already carries the soft-delete filter. But honestly, if your system is complex enough and has any custom row-level security, if it is multi-tenant, if not all users see everything, you already have a quite similar problem: you already have to start with a non-empty query which already has WHERE and possibly even JOIN clauses. Without a good ORM it will suck anyway.
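To illustrate point 3: a partial unique index on a hypothetical users table keeps only the active rows unique, so a soft-deleted user no longer blocks the username:

CREATE UNIQUE INDEX users_username_active_uniq
ON users (username)
WHERE deleted_at IS NULL;

-- the soft-deleted user keeps its row while a new signup reuses the username,
-- and soft-deleted rows never enter the index at all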
Does anyone have experience with soft deletes? Any links?
I've worked both ways. Hard DELETE and soft UPDATE deleted_at = NOW().
They both have tradeoffs.
It seems to me that deleted_at columns allow for faster development at the cost of more runtime bugs. Like why is this deleted row being listed in this <select>? Did I forget to add `WHERE deleted_at IS NULL` again?
Meanwhile a hard DELETE usually means I get to copy rows to a cemetery table just in case. And it forces me to think about what to do with Foreign Keys that point to the row being deleted. Can I really delete this row if 3 other tables point to it? It makes some business rules evident.
I'm thinking, perhaps I can serialize hard DELETED rows as JSON and store all system deleted rows in a trash table. This way I don't have to keep schema in sync there too.
> Like why is this deleted row being listed in this <select>? Did I forget to add `WHERE deleted_at IS NULL` again?
This is adding more complexity but... views?
For example: rename the "foo" table to "all_foo" and "CREATE VIEW foo AS SELECT * FROM all_foo WHERE deleted_at IS NULL"
I think you could even create a trigger on the view so "DELETE FROM foo" gets replaced with an "UPDATE all_foo SET deleted_at = NOW()", but I've never tried that before.
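That should work in Postgres with an INSTEAD OF trigger on the view. A rough sketch, assuming all_foo has an id primary key:

CREATE FUNCTION soft_delete_foo()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE all_foo SET deleted_at = now() WHERE id = OLD.id;
    RETURN OLD;
END;
$$;

-- the trigger takes precedence over the view's auto-updatable behavior,
-- so DELETE FROM foo becomes a soft delete on all_foo
CREATE TRIGGER foo_soft_delete
INSTEAD OF DELETE ON foo
FOR EACH ROW EXECUTE FUNCTION soft_delete_foo();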
> Did I forget to add `WHERE deleted_at IS NULL` again?
This seems like an unaddressed use case on the part of the RDBMS to me, a feature that shouldn’t be reinvented by ORMs or each individual project.
Instead, you should be able to just do DELETE SOFT FROM … in your queries and have the DB make the rows unavailable in all queries.
Or maybe even set it up, so that regular DELETE is treated like a soft delete, so that suddenly all of the ORMs and users wouldn’t be able to drop data on accident.
Maybe allow overwriting it with DELETE /*+DELETE_HARD*/ FROM … if you so choose, as well as do SELECT /*+WITH_SOFT_DELETED*/ … if you don’t want to introduce any historical incompatibility.
If such features don’t exist at the DB or a DB extension level, then we’ll be doomed to deal with ORMs reimplementing it which will be pretty bad if you ever need more than one tech stack connected to it, alongside the issues of all of your queries needing additional conditions.
Temporal tables seem like a step in the right direction, but they too add complexity and gotchas. And I think they are kind of overkill for just soft deletes.
Another low-level solution is views. But I have databases approaching a thousand tables; duplicating 1000 tables as views would add significant friction to development and maintenance.
> I'm thinking, perhaps I can serialize hard DELETED rows as JSON and store all system deleted rows in a trash table. This way I don't have to keep schema in sync there too.
this is normally known as an audit table.
The problem with all of these methods, though, is that the data is persisted but the strategy to actually recover it is left as an exercise for the reader. And in my experience, it's never implemented, and the data is never actually reviewed/recovered in practice. The scenarios where a "recycling bin" is actually needed usually end up encoded as a user-facing business requirement with well-defined, non-generic, and actually recoverable logic that has nothing to do with the generically applicable soft/hard/audit strategies.
I agree it's very rare, if ever, that business folks ask to undelete data.
But I have to say, with the developer hat on, softdeleted rows helped me debug systems more than once. With that said, softdeleted rows also wasted my time more than I'd like to admit while exploring data only to realize I was looking at stale, deleted data.
An audit table is useful for that purpose. Soft delete is a product of the confused mind, thinking one can get data recovery almost for free. Audit tables admit that, at best, it'll be a manual process to be custom-designed for the problem at hand, when it comes to it.
I'm currently in the process of migrating back to hard-delete from soft-delete on my current company's codebase. We use a slightly different approach of using triggers to do cascade deletes. It's just very difficult to maintain as the product grows bigger. All that additional overhead to create additional rules for every table, and the scope for human error. Overall, I don't recommend anyone take the route of soft-deletes unless it's absolutely required.
Not a fan of soft deleting, but as I sit here I wonder if using an RLS policy to limit to anything without a deleted field would make the whole “forgetting a where” a bit easier? I’m not sure the performance would be great though
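Something along these lines, with a hypothetical customer table and made-up role names (usual caveats: table owners bypass RLS unless you FORCE it, and write commands would need their own policies):

ALTER TABLE customer ENABLE ROW LEVEL SECURITY;

-- the application role only ever sees live rows, no extra WHERE clause needed
CREATE POLICY hide_deleted ON customer
    FOR SELECT
    TO app_user
    USING (deleted_at IS NULL);

-- a dedicated recovery role, as suggested upthread, can still see soft-deleted rows
CREATE POLICY see_everything ON customer
    TO recovery_admin
    USING (true);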