
In the past year or so I have thought a lot about how to design tables and columns within databases, and there is nearly nothing that wouldn't get more robust by adding a "valid_from" and "valid_till" and letting the field accept multiple values. Someone's name is Foo? What if they change it to Bar at some point and you need to access something from before with the old name?

If you have only a name field that holds a single value, that is going to require a crazy workaround. If names reference a person together with a validity date, it is much easier. But you need to make that decision pretty early.
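A minimal sketch of that shape, in plain SQL; the table and column names are just illustrative:

    CREATE TABLE person (
        person_id bigint PRIMARY KEY
    );

    CREATE TABLE person_name (
        person_id  bigint REFERENCES person,
        name       text NOT NULL,
        valid_from date NOT NULL,
        valid_till date            -- NULL = still current
    );

    -- The name in effect on a given date
    SELECT name
    FROM person_name
    WHERE person_id = 42
      AND valid_from <= DATE '2019-06-01'
      AND (valid_till IS NULL OR valid_till > DATE '2019-06-01');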



The tradeoff is that this is very expensive at the scale of large geospatial data models, both in terms of performance and storage. In practice, it is much more common to just take regular snapshots of the database. If you want to go back in time, you have to spin up an old snapshot of the database model.

A less obvious issue is that to make this work well, you need to do time interval intersection searches/joins at scale. There is a dearth of scalable data structures and algorithms for this in databases.
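For what it's worth, single-node PostgreSQL gets you part of the way with range types and GiST indexes (table names here are illustrative), though it doesn't address the at-scale problem:

    CREATE TABLE address_history (
        person_id bigint,
        address   text,
        validity  tstzrange NOT NULL
    );
    CREATE INDEX ON address_history USING gist (validity);

    -- Interval-intersection search: everything overlapping a given window
    SELECT *
    FROM address_history
    WHERE validity && tstzrange('2020-01-01', '2021-01-01');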


Anyone who works with human names should take a look at the HL7 V3 and FHIR data models, which were designed for healthcare. They support name validity ranges and a bunch of other related metadata. It can be challenging to efficiently represent those abstract data models in a traditional relational database, because with a fully normalized schema you end up needing a lot of joins.
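As a rough illustration of why the joins pile up (my own sketch of a normalized FHIR HumanName, not an official schema): HumanName carries a use code, a family name, a repeating given-name element, and a validity period, so even the name alone wants two tables:

    CREATE TABLE human_name (
        name_id      bigint PRIMARY KEY,
        person_id    bigint NOT NULL,
        use_code     text,           -- e.g. official, nickname, maiden
        family       text,
        period_start date,
        period_end   date
    );

    CREATE TABLE human_name_given (  -- "given" is a repeating element
        name_id  bigint REFERENCES human_name,
        position int,
        given    text
    );

    -- Current official name, given parts in order
    SELECT n.family, g.given
    FROM human_name n
    JOIN human_name_given g ON g.name_id = n.name_id
    WHERE n.person_id = 42
      AND n.use_code = 'official'
      AND (n.period_end IS NULL OR n.period_end > CURRENT_DATE)
    ORDER BY g.position;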


If you have an "audit" table, where you write a copy of the data before updating it in the primary table, that's a decision you can make at any point.

Of course, you don't get that historical data, but you do get it going forward from there.


SQL:2011 defines temporal tables, which few FOSS databases support. I used it in MariaDB:

https://mariadb.com/kb/en/temporal-tables/

and if your schema doesn't change much, it's practically free to implement, much easier and simpler than copy-pasting audit tables or relying on codegen to do the same.
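For reference, it looks roughly like this (syntax per the MariaDB page above):

    CREATE TABLE person (
        id   INT PRIMARY KEY,
        name VARCHAR(100)
    ) WITH SYSTEM VERSIONING;

    -- Normal reads see only current rows; time travel is one clause away
    SELECT * FROM person
    FOR SYSTEM_TIME AS OF TIMESTAMP '2024-01-01 00:00:00';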


something like https://www.pgaudit.org/ ?

Basically you keep a history of all changes so you can always roll back / get that data if needed?
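For anyone curious, enabling it looks roughly like this (settings taken from the pgaudit docs); note it writes audit records to the regular PostgreSQL server log rather than to a queryable table:

    -- in postgresql.conf: shared_preload_libraries = 'pgaudit'
    CREATE EXTENSION pgaudit;
    SET pgaudit.log = 'write, ddl';  -- log INSERT/UPDATE/DELETE/TRUNCATE and DDL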


The last time we did this, we basically hand-rolled our own, with a database trigger to insert data into a different table whenever an `UPDATE` statement happened.
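A minimal sketch of that kind of hand-rolled trigger, assuming Postgres; dumping the old row as jsonb keeps it tolerant of schema changes, and all names here are illustrative:

    CREATE TABLE person_audit (
        audit_id    bigserial PRIMARY KEY,
        archived_at timestamptz NOT NULL DEFAULT now(),
        old_row     jsonb NOT NULL
    );

    CREATE FUNCTION person_audit_fn() RETURNS trigger AS $$
    BEGIN
        INSERT INTO person_audit (old_row) VALUES (to_jsonb(OLD));
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER person_audit_trg
        BEFORE UPDATE ON person
        FOR EACH ROW EXECUTE FUNCTION person_audit_fn();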

But this seems like it's probably a better solution.


I haven't used pgaudit yet, so I can't vouch for it, but I have it on the backburner/log of things to try for exactly this use case!

I think the real magic is that it leverages the WAL (write-ahead log) from the pg engine itself, which you could certainly hook into yourself too, but I'm not a db expert here.


I just found out about bemi.io, seems like they're targeting this issue


See also "Eventual Business Consistency"[0] by Kent Beck. Really good read.

> Double-dated data—we tag each bit of business data with 2 dates:

> * The date on which the data changed out in the real world, the effective date.

> * The date on which the system found out about the change, the posting date.

> Using effective & posting dates together we can record all the strange twists & turns of feeding data into a system.

[0] https://tidyfirst.substack.com/p/eventual-business-consisten...
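In SQL terms, the double-dating is just two extra columns plus an "as of both dates" query; a sketch (assuming Postgres, illustrative names):

    CREATE TABLE salary (
        person_id      bigint,
        amount_cents   bigint,
        effective_date date NOT NULL,  -- when it changed in the real world
        posting_date   date NOT NULL   -- when the system found out
    );

    -- "What did we believe, as of 2024-03-01, the salary was on 2024-01-15?"
    SELECT DISTINCT ON (person_id) amount_cents
    FROM salary
    WHERE person_id = 42
      AND effective_date <= DATE '2024-01-15'
      AND posting_date   <= DATE '2024-03-01'
    ORDER BY person_id, effective_date DESC, posting_date DESC;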


Thanks for posting this. I read this a while ago, but it is worth revisiting.


One that routinely surprises me is that this is not easy to do in any popular contacts application. I actually would like to keep every address I've ever had in there, in case I need to remember it for some reason later. Maybe I just want to reminisce. I don't want to accidentally have it as an "active" one, though.



