The problem with modern development is having to nail down the data model first....

indymike · on Jan 8, 2024

> The problem with modern development is having to nail down the data model first.

Schemaless was one of the original drivers for NoSQL databases.

Now, when I need something schemaless, I start with a Postgres table with an ID and a jsonb or json field... which at least makes it easy to have a schema when the inevitable happens and schema-dependent code ends up getting added to the project.
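A minimal sketch of that starting point and of the later "inevitable" step. To keep it self-contained it uses Python's built-in `sqlite3` with a `TEXT` column standing in for a Postgres `jsonb` column, which is an assumption made purely for illustration; the `events` table and its fields are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Start "schemaless": an ID plus one JSON document per row.
# (In Postgres the column type would be jsonb; TEXT stands in here.)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO events (doc) VALUES (?)",
    [
        (json.dumps({"user": "alice", "kind": "login"}),),
        (json.dumps({"user": "bob", "kind": "purchase", "amount": 12}),),
    ],
)

# Later, code starts depending on "kind", so promote it to a real column
# and backfill it from the stored documents.
conn.execute("ALTER TABLE events ADD COLUMN kind TEXT")
rows = conn.execute("SELECT id, doc FROM events").fetchall()
for row_id, doc in rows:
    conn.execute(
        "UPDATE events SET kind = ? WHERE id = ?",
        (json.loads(doc).get("kind"), row_id),
    )

kinds = [k for (k,) in conn.execute("SELECT kind FROM events ORDER BY id")]
print(kinds)
```

Because the documents already live in a relational table, adding the schema is an `ALTER TABLE` plus a backfill rather than a move to a different data store.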

> To do this every data dependency in the system needs to be traceable.

This is a hard problem.

sgarland · on Jan 8, 2024

Admittedly, yes. This is the massive appeal of Mongo et al., or just JSON[B] columns in an RDBMS.

Unfortunately, at a very deep level, that’s simply not how RDBMS works. The tuples are a B+tree, and in some (MySQL [InnoDB], SQL Server) cases everything is clustered around the PK. If you don’t create a data model that’s easily exploitable for optimizations designed around that data structure, you’re gonna have a bad time. It’s no different than if you decided to use strings to store ints – you _can_, but it’s a bad idea for a variety of reasons.

What you can do is give yourself as much leeway as possible, by following some basic best practices. For example, it’s a hell of a lot easier to update a tiny reference table than to update billions of rows when you decide that column `region` should say `European Union` instead of `EU`.
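A rough sketch of the reference-table point, again using Python's built-in `sqlite3` so it runs anywhere. The `region` and `customer` tables are hypothetical, and 1,500 rows stand in for the billions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        region_id INTEGER NOT NULL REFERENCES region(id)
    );
    INSERT INTO region (id, name) VALUES (1, 'EU'), (2, 'US');
    """
)
conn.executemany(
    "INSERT INTO customer (region_id) VALUES (?)",
    [(1,)] * 1000 + [(2,)] * 500,
)

# Renaming the region touches one row in the reference table,
# not the customer rows that point at it.
cur = conn.execute("UPDATE region SET name = 'European Union' WHERE name = 'EU'")
rows_updated = cur.rowcount

eu_customers = conn.execute(
    "SELECT COUNT(*) FROM customer c JOIN region r ON r.id = c.region_id "
    "WHERE r.name = 'European Union'"
).fetchone()[0]
print(rows_updated, eu_customers)
```

Had `customer` stored the string `'EU'` directly, the same rename would have been an `UPDATE` over every matching customer row.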

datastack · on Jan 8, 2024

NoSQL doesn't solve the schema migration problem. It just means you don't formalize your schema. But your code will implicitly require a certain schema anyway. Changing the schema means changing the code and migrating data. You'll have to write migration scripts and think about backward compatibility. Same problems as in SQL.
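To make the implicit schema concrete, here is a small sketch with plain Python dicts standing in for documents in a schemaless store. The document shapes, the `schema_version` field, and the function names are all hypothetical.

```python
def full_name_v1(doc):
    # Reader code silently assumes a "name" key: that is the implicit schema.
    return doc["name"]


def migrate_v1_to_v2(doc):
    """Split "name" into "first"/"last" (the hypothetical v2 shape)."""
    if doc.get("schema_version", 1) >= 2:
        return doc  # already in the new shape
    first, _, last = doc["name"].partition(" ")
    rest = {k: v for k, v in doc.items() if k != "name"}
    return {**rest, "schema_version": 2, "first": first, "last": last}


def full_name(doc):
    # Backward-compatible reader: upgrade old documents on read.
    doc = migrate_v1_to_v2(doc)
    return f"{doc['first']} {doc['last']}".strip()


old_doc = {"name": "Ada Lovelace"}
new_doc = {"schema_version": 2, "first": "Grace", "last": "Hopper"}
names = [full_name(old_doc), full_name(new_doc)]
print(names)
```

Nothing in the store enforces either shape, yet the code still needs a migration function and a version check, which is the same work a SQL migration would require.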

valty · on Jan 8, 2024

The trick is maintaining a full graph of all data dependencies through the entire codebase. Then migrations can be done with ease. But no one does this. They shovel data from one database to the next, with tons of little ad hoc data stores along the way.
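As a toy illustration of what such a graph could look like (not a claim about any existing tool), the sketch below records which consumers read which data and walks the graph to find everything a change would touch. Every node name is made up.

```python
from collections import deque

# Hypothetical edges: each key is a data source, each value lists
# the components that read from it.
DEPENDENTS = {
    "users.email": ["signup_service", "email_cache"],
    "email_cache": ["newsletter_job"],
    "signup_service": ["audit_log"],
    "orders.total": ["billing_report"],
}


def affected_by(source, graph):
    """Return everything downstream of `source`, found breadth-first."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


impacted = affected_by("users.email", DEPENDENTS)
print(impacted)
```

The traversal is the easy part. The hard part is keeping a table like `DEPENDENTS` complete and current across every service and ad hoc store.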

valty · on Jan 8, 2024

Yeah, an RDBMS is probably the wrong choice for most apps. It was good for crunching sales data in batches back in the day. Everything today is pipelines and reactivity.

My dream is to have a tool to model my logical data model and then it will organize my data into the best storage and caches.

I don't think any existing database today is useful.

sgarland · on Jan 8, 2024

Ah, I misunderstood your point. I disagree that RDBMS is the wrong choice. Most apps are CRUD, and have the same basic patterns.

vrighter · on Jan 8, 2024

I'd rather the data model be designed properly upfront so that it doesn't need to change, but can be extended with new functionality.

valty · on Jan 11, 2024

This is impossible.