I work in one of the large tech companies, and I can attest that while the idea seem...

bob1029 · 2025-03-14T10:25:56 1741947956

> sooner or later people realize that they need to dynamically adjust parts of the pipeline

The customer is the hard part in all of this, but there is respite if you are patient and careful with the tech.

If you are in a situation where you need to go from one SQL database to another SQL database, the number of additional tools required should be zero. Using a MERGE statement & recursive CTEs per target table, you can transform any schema into any other. Most or all of the actual business logic can reside in the command text: how we filter & project data into the target system.
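To make the "one statement per target table" idea concrete, here is a minimal sketch, not the commenter's actual pipeline. It runs on SQLite through Python's stdlib `sqlite3` so it is self-contained. SQLite has no MERGE, so `INSERT ... ON CONFLICT DO UPDATE` stands in for it here; on SQL Server or Postgres 15+ the same shape would be written with MERGE. The tables `src_category` and `dst_category` and the function names are hypothetical.

```python
import sqlite3

# One statement per target table: a recursive CTE flattens the source
# hierarchy, and an upsert (standing in for MERGE) writes it to the target.
SYNC_CATEGORIES = """
WITH RECURSIVE tree(id, path) AS (
    SELECT id, name FROM src_category WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, t.path || '/' || c.name
    FROM src_category AS c
    JOIN tree AS t ON c.parent_id = t.id
)
INSERT INTO dst_category (id, path)
SELECT id, path FROM tree WHERE true  -- WHERE avoids an upsert parsing ambiguity
ON CONFLICT(id) DO UPDATE SET path = excluded.path
"""


def build_demo(conn):
    """Create the hypothetical source and target schemas with sample rows."""
    conn.executescript("""
        CREATE TABLE src_category (
            id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT NOT NULL);
        CREATE TABLE dst_category (
            id INTEGER PRIMARY KEY, path TEXT NOT NULL);
        INSERT INTO src_category VALUES
            (1, NULL, 'root'), (2, 1, 'tools'), (3, 2, 'saws');
    """)


def sync_categories(conn):
    """Project the source hierarchy into the target table; safe to rerun."""
    conn.execute(SYNC_CATEGORIES)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_demo(conn)
    sync_categories(conn)
    for row in conn.execute("SELECT id, path FROM dst_category ORDER BY id"):
        print(row)
```

All of the transformation logic lives in the `SYNC_CATEGORIES` text, and rerunning it after the source changes updates the existing target rows instead of duplicating them.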

If we accept the SQL-to-SQL case has a good general solution, I would then ask if it is possible to refactor all problems such that they wind up with this shape in the middle. All of that nasty systems code could then be focused more on loading and extracting data into and out of this regime where it can be trivially sliced & diced. Once you have something in Postgres or SQL Server, you are at the top of the hill. Everything adapts to you at that point. Talking to another instance of yourself - or something that looks & talks like you - is trivial.

The other advantage with this path is that refactoring SQL scripts is something the customer (B2B) can directly manage in many situations. The entire pipeline can live in a single text file that you throw around an email chain. You don't have to teach them things like Python, YAML or source control.

bbminner · 2025-03-16T19:54:30 1742154870

In fact, I have also converged on SQL as a universal data transformation language. External analogs include things like DuckDB. Unfortunately, even with pipe syntax, SQL lacks expressiveness, which has pushed me back to C-style macros in SQL (e.g. to make a table name dynamic), and in the long run that makes things far less maintainable.
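For context on why the table name ends up in a macro: bind parameters can carry values but not identifiers, so a dynamic table name has to be spliced into the statement text somehow. A minimal sketch of doing that from a host language instead of a C-style preprocessor, assuming Python's stdlib `sqlite3`; the helpers `quote_ident` and `count_rows` are hypothetical names, not anything from the comment above.

```python
import re
import sqlite3

# Accept only plain identifiers so the spliced name cannot carry SQL.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_ident(name):
    """Validate a table name and return it double-quoted for use in SQL text."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return '"' + name + '"'


def count_rows(conn, table):
    """Count rows in a table whose name is only known at run time."""
    # The table name is templated into the text; values would still use '?'.
    sql = f"SELECT COUNT(*) FROM {quote_ident(table)}"
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events_2025 (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO events_2025 (id) VALUES (?)", [(1,), (2,)])
    print(count_rows(conn, "events_2025"))
```

This keeps the templating in one small, checked function, though it does not remove the underlying maintainability concern: the SQL is still being generated rather than written.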

lucyjojo · 2025-03-14T06:29:20 1741933760

yeah, in most projects, when you spot a config file, its complexity will tend to scale with the complexity of the domain you capture.

so either it's very small/mature and you don't have to worry too much, or in the active development case your config files are pretty much the instruction set of some kind of logical foggy vm... and eventually a whole environment of tools etc. will "compile down" to your config files and you get a pain knot to endlessly massage...

chenquan · 2025-03-14T05:36:09 1741930569

Thank you for your valuable experience, I will seriously think about what you said.

NeutralForest · 2025-03-14T09:40:21 1741945221

Pretty much my take any time I see all the convoluted Bicep and YAML we have since there's a bunch of conditional logic and more in our pipelines.

narad_muni · 2025-03-20T07:54:21 1742457261

I have been building the same thing in Rust. The data processing part is quite complex, and I agree that coding it in Python is a better way.

_ink_ · 2025-03-14T08:35:23 1741941323

So far, this was exactly my experience as well. Well said.