> *The entire thing is stitched together by spreadsheets that are parsed by Pyth...

baz00 · on Oct 30, 2023

Everything I look at these days looks like this. And most of the time it doesn't even solve the initial problem statement but everyone is too naive to even realise that.

The worst thing I've seen is a stack that parses out a file and loads it into a DB. So someone sends us a file via an expensive SFTP+S3 thing in AWS. That is then picked up by some scheduled task using a proprietary in house scheduler process running inside kubernetes. This proceeds to download the file to the local pod. Then it makes tens of thousands of API calls to match up data which cranks the CPU up on a huge database server. This breaks all the other jobs running. Then it writes another file out to S3, consuming 17GB of RAM in the process. Another process picks that up and then batches it and inserts it into the DB with no transactional stuff around it.

The original process this replaced was a copy into a temporary table and then a bit of transaction-wrapped SQL that took about 20 seconds to import + run. They improved that to 7 hours and reduced the success rate from 100% to about 80%
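
For anyone who hasn't seen the "copy into a temporary table, then transaction-wrapped SQL" pattern, here is a minimal sketch of the idea. It uses SQLite from the Python standard library purely for illustration; the `payments` table, its columns, and the `import_file` helper are hypothetical, not the system described above.

```python
import csv
import io
import sqlite3


def import_file(conn, csv_text):
    """Load rows into a temporary staging table, then merge them in one transaction."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Hypothetical staging table; created up front, outside the import transaction.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS staging (id INTEGER, amount REAL)")
    # The connection context manager commits on success and rolls back on any error,
    # so a bad file leaves the target table untouched.
    with conn:
        conn.execute("DELETE FROM staging")
        conn.executemany("INSERT INTO staging (id, amount) VALUES (?, ?)", rows)
        # The matching is done set-based in SQL rather than with per-row API calls.
        conn.execute(
            "INSERT OR REPLACE INTO payments (id, amount) "
            "SELECT id, amount FROM staging"
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL)")
imported = import_file(conn, "1,10.5\n2,20.0\n")
```

A malformed row (for example, one with a missing column) makes `executemany` raise, and the whole import is rolled back instead of being half-applied.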

dmd · on Oct 30, 2023

I am currently working with a US government system for downloading public scientific data. You select some data you want to download and add it to a shopping cart. Check out, and select 'create database'. This generates your own copy of an Oracle database, with your own credentials and hostname and db name. Connect to that and construct a query against a table that has some metadata about studies you're interested in. Using the identifiers from that table, join with a LIKE against another table for s3:// URLs. (There are no primary keys and the other table's column is not exactly the same; you need to use a LIKE. This is all documented.) Those s3 URLs point to a CSV which contains another identifier, which you use to download manifests, which contain links to a web page created on the fly, which contains links to the s3 files to download. By the time you've done all this, your access has likely expired and you must start over from scratch.

blantonl · on Oct 30, 2023

I'm going to take a deep breath, and move on to the other stories further down.

So far I'm feeling pretty good about what I've developed over the years.

dmd · on Oct 30, 2023

(I assume this is all a reflection of https://en.wikipedia.org/wiki/Conway%27s_law )

alliao · on Oct 30, 2023

voted for the Kafkaesque of 2023

foobiekr · on Oct 30, 2023

I know of an engineer who built a work queue by having a chain from an app to kafka to a processor to Kafka to a database writer.

Literally instead of a table.

This stuff is everywhere. Microservices made it worse and half legitimized it.
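
As a rough sketch of what "a table" can look like as a work queue: the example below uses SQLite from the Python standard library, and the `jobs` table, its `status` values, and the helper names are all hypothetical. On PostgreSQL the claim step is often written with `SELECT ... FOR UPDATE SKIP LOCKED` instead; that variant is not shown here.

```python
import sqlite3


def make_queue(conn):
    # Hypothetical schema: one row per job, with a simple status column.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )


def enqueue(conn, payload):
    with conn:
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_next(conn):
    """Claim the oldest pending job and return (id, payload), or None if nothing was claimed."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard makes this update a no-op if another worker claimed the job first.
        cur = conn.execute(
            "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        return row if cur.rowcount == 1 else None


def complete(conn, job_id):
    with conn:
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))


conn = sqlite3.connect(":memory:")
make_queue(conn)
enqueue(conn, "resize image 1")
enqueue(conn, "resize image 2")
first = claim_next(conn)
complete(conn, first[0])
```

The queue state is then just rows you can inspect, retry, or fix with ordinary SQL.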

KMag · on Oct 30, 2023

But, going the other way, I worked for over a decade on Goldman Sachs's SecDB system. It's a quirky steampunk alternative future that branched from our light cone around 1995. There's a globally distributed eventually consistent NoSQL database tightly integrated with a data-flow gradually-typed scripting language (and a very 1990s feel 16 color IDE). I'm sure in the late 1990s/early 2000s (before globally distributed NoSQL was popular and before gradual/dynamic typing had a resurgence) it was more like discovered alien technology than steampunk alternative future. (Also, with source code being executed from globally distributed immutable database snapshots, deployment is much nicer than anything else I've used to date. After release testing, set a database key to point to the latest snapshot, and you're deployed.)

There's a service that watches the transaction log of your regional replica so that you can make long-poll HTTP requests that return when any change matching your filter is committed. (Edit: usually the HTTP result handler is used to invalidate specific memoized results in the data flow graph, letting lazy re-evaluation re-fetch the database records as needed.)

It makes a lot of sense for a financial risk system, where you end up calculating millions of slight variations on a scenario. The data flow model with aggressive memoization makes this sort of thing much cheaper.
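
To make "data flow with aggressive memoization" concrete, here is a toy sketch of a memoized graph with explicit invalidation and lazy re-evaluation. It is a generic illustration only and not how SecDB is implemented; the `Node` class, the `store` dict standing in for database records, and the `evaluations` counter are all invented for the example.

```python
class Node:
    """A lazily evaluated, memoized value in a tiny data-flow graph."""

    def __init__(self, compute, *inputs):
        self.compute = compute
        self.inputs = inputs
        self.dependents = []
        self.cached = False
        self.value = None
        self.evaluations = 0  # counts how often compute() actually ran
        for node in inputs:
            node.dependents.append(self)

    def get(self):
        # Recompute only if the memoized result was invalidated (or never computed).
        if not self.cached:
            self.value = self.compute(*(n.get() for n in self.inputs))
            self.cached = True
            self.evaluations += 1
        return self.value

    def invalidate(self):
        # Drop this result and everything downstream; nothing is recomputed until get().
        if self.cached:
            self.cached = False
            for dependent in self.dependents:
                dependent.invalidate()


# Hypothetical "database records" feeding the graph.
store = {"spot": 100, "quantity": 3}
spot = Node(lambda: store["spot"])
quantity = Node(lambda: store["quantity"])
position_value = Node(lambda s, q: s * q, spot, quantity)

before = position_value.get()
store["spot"] = 110   # a change notification arrives for one record
spot.invalidate()     # invalidate just that memoized result
after = position_value.get()
```

Only `spot` and `position_value` are re-evaluated after the change; `quantity` keeps its memoized result, which is where the savings come from when you compute many slight variations of a scenario.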

However, I saw plenty of systems written where you'd attempt to write your request to the next key matching some regex (and retry with the next key if it already existed), where your request would contain some parameters and the database key and/or filesystem path where results should be written.

Under-experience with databases easily results in rewriting a database using message queue/bus. Under-experience with message queues/busses easily results in rewriting a message queue/bus using a database.

kwillets · on Oct 31, 2023

Message queueing triggers a host of psychological needs. Synchronous jobs rarely need monitoring and management features, but move the same work to a queue, and everybody loses their minds.

Groxx · on Oct 30, 2023

I've seen so many people spend weeks if not months "working" to avoid doing a trivial database migration. Database fear is overwhelmingly powerful in a lot of people it seems.

znpy · on Oct 30, 2023

> Database fear is overwhelmingly powerful in a lot of people it seems.

Databases are still fairly poorly documented when it comes to administrative work.

There is an incredible amount of tutorials, books and courses on how to write SQL queries and stuff... But there is almost zero content on how to properly administer a database.

I mean, from novice admin to DBA-level capabilities.

I said all this before and i'm ready to write this again: i think there's a good market space for dba-style courses.

foobiekr · on Oct 31, 2023

This does not honestly seem true for the main sql databases. For everything else, yes, but if people were actually learning databases instead of hiding from them most of the uptake there wouldn’t exist anyway.

ljm · on Oct 30, 2023

I think I've seen enough complexity created by engineering teams given total autonomy, with hands-off leadership, that I'd prefer a much more constrained approach. There should still be autonomy, of course, but proposals for new tech, languages and paradigms should only be considered with due diligence.

The most unpleasant codebases I've dealt with are ones that have suffered from a lack of strong leadership, and they are almost uniquely microservice setups that pull in everything but the kitchen sink, usually because it's just trendy to use it. Monoliths can get pretty damn ugly too but at least it's contained in one single codebase.

JohnMakin · on Oct 30, 2023

This must be common, or we worked at the same company; I've seen this exact pattern.

It went like:

App -> DynamoDB -> Kafka Connect Sink Process -> RDS -> Kafka

The reason for all the middle processes was that teams couldn't agree how to structure their data and the first app would dump literal nonsense sometimes, so the Kafka connect process's job was to clean it and dump any of the nonsense they pumped into it. Pretty sure there was a gnarly log aggregation layer in the middle somewhere too IIRC.

specialist · on Oct 30, 2023

Repeating myself...

Just two examples from my prior gig (fashion e-commerce).

#1 Our hottest dataset (db of current products) stored in DynamoDB. Core dependency for all our code. Easily fits in < 1 GB of RAM. OMG, just make a hashmap. Over a year, I managed to persuade the team to start the transition from DynamoDB to Redis.

#2 Tiny (vs micro) service that munged some URLs. Blocker for an important campaign. Prior team of 4 churned for a year, was no closer to delivery. Spring, ORMs, CI/CD pipelines, the works. I spent a week unraveling the requirements (repeated facepalm). A second week banging out a trivial nodejs thing. (My team preferred nodejs, which was their prerogative.) Really trivial. I felt so bad for the biz dev people who'd been dying to get this functionality for so long.

oooyay · on Oct 30, 2023

You can actually make a very comfortable career of Senior and Staff by learning to identify this kind of work/system and proposing ways to simplify it. These kinds of systems, as the author pointed out, are incredibly expensive and inefficient, but look readable on an architecture diagram.

mring33621 · on Oct 30, 2023

As opposed to all those people that make similarly comfortable careers in middle and upper mgmt by identifying simple systems and complicating them beyond recognition?

oooyay · on Oct 30, 2023

Hah, yes. I will say that while I understand the general disdain here, as I grew more senior in my career I realized the world takes all types. There are "doers" who will rush to an end goal that's highly prioritized and then there's "optimizers" who come fix that mess up into a durable, cost-effective system. Some people are gifted enough in knowledge and have the right business priority to do both at the same time, but usually they're required at different times.

Anecdotally, optimization tasks (in this brain) are multitudes easier than innovation tasks. I spend a lot of time thinking about how to do things differently whereas optimization utilizes many lessons I've learned over and over again with well-trodden patterns. That's to say, I'm grateful for the doers :)

solarkraft · on Oct 30, 2023

These roles aren't opposed at all, they greatly benefit each other :-)

Bringing what used to be the privilege of upper management (wasting massive amounts of resources while getting paid handsomely) down to software developers.

It's that trickle-down effect people talked about, right?

joshuahutt · on Oct 30, 2023

It's a beautiful symbiosis.

blitzar · on Oct 30, 2023

> learning to identify this kind of work/system and proposing ways to simplify it

"I don't think you are fitting in here at MegaCorp"

ljm · on Oct 30, 2023

"We need this to scale to hundreds of millions of users across many regions"

"But we have no users at all right now"

"But we might have hundreds of millions of users in future"

ludicity · on Oct 30, 2023

I am the author and this has caused me psychic damage.

mateo411 · on Oct 30, 2023

You should take a vacation.

avgDev · on Oct 30, 2023

"You are being negative, the system is great, we don't like that kind of attitude here".

nine_zeros · on Oct 30, 2023

> "You are being negative, the system is great, we don't like that kind of attitude here".

I want to see more lines of code, not less.

blitzar · on Oct 31, 2023

Hmm looking at your git statistics it appears you have only pushed 60 commits this month with 12,000 lines of code changed - while Jimmy over here has pushed 200 commits this month with 200 lines of code changed.

If you do not improve next month we are going to have to let you go, we just can't carry a 0.3x employee such as yourself.

lainga · on Oct 30, 2023

> You can actually make a very comfortable career of Senior and Staff by learning to identify this kind of work/system and proposing ways to simplify it.

Where?

oooyay · on Oct 30, 2023

I've typically worked in SRE and platform engineering work and that's where I've gotten exposed to these kinds of Rube Goldberg machines. Make a short list of them when you find them and then use them as a hit list during "cost cutting". Most people don't want to touch these systems because they look big and expansive and generally "work". They're just very poorly optimized.

Dare I say, any time I see a function as a service my brain immediately drifts to inspecting the cost implications of said process.

tomrod · on Oct 30, 2023

IT departments, typically, though occasionally there are whole companies that work in "technology" where this type of work can be found.

I said the above as a jest, but seriously, simplification of complex stacks has been a good consulting gig.

foobiekr · on Oct 30, 2023

Most large companies. There is a stark difference between the distinguished engineers and the tier below them in terms of asking people to stop doing things badly.

baz00 · on Oct 30, 2023

Tried it. Nope. You can get people to acknowledge it but because it's not a fun project or doesn't involve an upsell you can bill the clients for, it'll go in a product backlog for a decade or two.

I don't care any more. I'm just there to tell people what's shit and then laugh when it explodes in their face.

pastage · on Oct 30, 2023

That is what skunkworks are for. You just deliver on time and you are fine.

baz00 · on Oct 30, 2023

What is this deliver thing? I haven't done anything productive for years.

claytonjy · on Oct 30, 2023

The easy part is choosing a better end-state; anyone can do that, and for any of these Rube Goldberg machines at a large-ish company, several people likely have.

What makes someone a staff+ is finding a path to iteratively evolving towards that end-state without breaking anything along the way and while having progress to show off at each step.

pastage · on Nov 4, 2023

Oh yes but part of that is knowing that you need to get others on board. Things get messy when no one knows the end goals with changes.

latexr · on Oct 30, 2023

That reads like the KRAZAM Microservices sketch.

https://www.youtube.com/watch?v=y8OnoxKotPQ

Lacerda69 · on Oct 30, 2023

KRAZAM is a prophet and must be protected at all cost

neilv · on Oct 30, 2023

I've used that video to explain to business people. It's watchable, and communicates important ideas of what a poopshow this can easily become, without having to talk about real partners/teams close to home as problems.

ludicity · on Oct 30, 2023

At my workplace (the one in the post), whenever one of the good engineers asks about how something works and it's one of these spaghetti-balls, we chorus "It's the design of our backend, okay?"

foobiekr · on Oct 30, 2023

I see stuff like this every day. It is a natural consequence of people who only “develop” by gluing things together. God help them if they’d actually have to write some core function themselves.

pphysch · on Oct 30, 2023

The worst kind of "DevOps engineer" that doesn't really understand development, operations, or engineering.

But hey, they can run some docker and git commands and piddle around an AWS GUI, which means they are highly technical.

candiodari · on Oct 30, 2023

You don't give enough credit to organization chart and project driven engineering.

When developing anything:

1) you don't get to touch anyone else's code. And another department's code? Something another manager's team manages ... that amounts to treason. Never for any reason. MAYBE if they've totally abandoned it and you absolutely need it (but only during unpaid overtime)

2) you don't get to spend ANY time on anything outside of the current project or JIRA ticket. Any time at all. So really, NOT optimizing anything is faster and cheaper. Just look at all the spreadsheets made!

wredue · on Oct 30, 2023

>piddle around in an AWS GUI

I’ve had enough calls with the “Senior/Technical Lead Azure Cloud Engineers” telling them exactly what they need to do that me and them really really don’t get along.

I don’t do any of that shit and even I can muddy my way through it, but these people cannot. The real kicker of it is how much these people make.

And you know how there are those people who, every time you need to work with them, they answer a teams call and then “need to get to my computer, give me 5” and their status is perpetually set to away? I don’t want to RTO at all, but dealing with this team almost makes me think I’m wrong about that.

vsareto · on Oct 30, 2023

>It is a natural consequence of people who only “develop” by gluing things together. God help them if they’d actually have to write some core function themselves.

That's on the industry for not training and gating well. It would be nice to have glue/plumber positions so expectations are not out of line too.

foobiekr · on Oct 31, 2023

I don’t think most of the coders who end up in glue code positions are actually trainable.

Agree solidly on the gating aspect though.

The problem is that quality hires continue to be rare and at some level you are doing area-under-the-curve reasoning.

Vicinity9635 · on Oct 30, 2023

I feel like this might be a case of data engineers.

They're not usually software engineers. They're tool users not tool makers.

So they'll cobble things together to accomplish the task, using only available tools and never anything custom that would do the task much more cleanly, because they understand data, not software. They're not computer scientists or programmers, they're just users. And we all know what that means.

entropicdrifter · on Oct 30, 2023

Agreed. I've been "the backend engineer who works with the data engineers" for several years now and I've seen their general trend of re-inventing the wheel the hard way a number of times.

I've spent the majority of my career building better tools for data-related tasks, then winning over my users by showing off performance and productivity gains.

scruple · on Oct 30, 2023

I stepped into a Data Engineering Lead role in 2019. Stepped out of it in 2021. My team was the first in the org to really approach data engineering and we were all software engineers. I'm told that the systems we built have largely been replaced by Rube Goldberg machines pieced together by the folks who came after us.

Those replacement systems aren't even working, they're failing to deliver on the same simple data pipelines that we had working by the start of 2020. They're cobbled together using a million little AWS pieces and Docker and k8s... I'm glad that I left that role when I did, we were being pushed by a new-hire with a fancy Data Engineering VP title to do all sorts of asinine things. I went and looked just now and I see that he's Senior VP at a different company, he started there this summer. Onward and upward!

gruturo · on Oct 30, 2023

:O

And I thought my unholy xmllint -xpath (bad stuff, lots of slashes) ${1} |sed -r -e s/this/that/ -e s/alsothis/alsothat/ -e /ohyeahthistoo/somethingelse/ | grep something | while read AA; do stuff then echo ${COUNTRY},${SIGN}$(perl -e "printf('%.2f', ${VAR}/1000000)"),${ENTRYDATE}; done|sort

was as bad as things get. I need to get my horror code game up. I mean, not only is the code awful, its very purpose is horrifying (XML to CSV with some transformations, bit of math, all without being able to use any external sources due to security, only what's in a baseline RHEL7 (soon 8, yay!) ).

I promise I'll rewrite it in python at some point.
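
For what it's worth, a standard-library-only Python version of that kind of XML-to-CSV job can be fairly short. The sketch below is built on assumptions: the `<entry>` layout and element names are invented (the real XPath wasn't shown), only the "divide by 1,000,000, two decimals, then sort" step is taken from the one-liner above, and it needs Python 3.6+, which a baseline RHEL 7 may not ship.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text):
    """Convert hypothetical <entry> elements to sorted CSV rows: country,signed amount,date."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("entry"):
        # Scale the raw amount down by one million, as the perl printf did.
        amount = int(entry.findtext("amount")) / 1_000_000
        sign = entry.findtext("sign", default="")
        rows.append([
            entry.findtext("country"),
            f"{sign}{amount:.2f}",
            entry.findtext("date"),
        ])
    rows.sort()  # stands in for the trailing `| sort`
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


sample = """
<entries>
  <entry><country>SE</country><sign>-</sign><amount>1500000</amount><date>2023-10-30</date></entry>
  <entry><country>DE</country><sign></sign><amount>2250000</amount><date>2023-10-29</date></entry>
</entries>
"""
result = xml_to_csv(sample)
```

Using the `csv` module rather than string concatenation also takes care of quoting if a field ever contains a comma.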

crooked-v · on Oct 30, 2023

The more experience I have, the more I start to think the Omnigres people are right about "just put literally everything into Postgres".

syntaxing · on Oct 30, 2023

I’m genuinely curious what a unit test for something would look like.

yonixw · on Oct 30, 2023

// TODO

ak39 · on Oct 31, 2023

LOL. What did this comment (no pun!) do for your HN karma points?

FartyMcFarter · on Oct 30, 2023

That would be an integration test, not a unit test.

jacquesm · on Oct 30, 2023

Checksum on the resulting csv with a parallel implementation of the whole pipeline ;)

syntaxing · on Oct 30, 2023

I don’t work with large databases so pardon my ignorance. Is there typically a “unit test” bucket you run it on or do you just put in test entries on a production bucket?

jacquesm · on Oct 30, 2023

Normally you'd fire up a separate environment, mock the process and see if it produces the expected results. By the time you put 'test entries in a production bucket' there are so many lines crossed that it likely won't end well even if the tests do pass.

hightrix · on Oct 30, 2023

We tend to only test what is being tested. So, most DB calls are mocked in our unit tests. For stored procs or other tests that need to be run on a DB, we use a test DB that is setup to mirror production.
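
A minimal sketch of what "most DB calls are mocked" can look like, using `unittest.mock` from the Python standard library. The `total_owed` function and the `fetch_invoices` method are hypothetical stand-ins, not anyone's real code.

```python
import unittest
from unittest import mock


def total_owed(db, customer_id):
    """Business logic under test: sum the unpaid invoices for one customer."""
    rows = db.fetch_invoices(customer_id)
    return sum(row["amount"] for row in rows if not row["paid"])


class TotalOwedTest(unittest.TestCase):
    def test_sums_only_unpaid_invoices(self):
        # The database layer is replaced by a mock, so no real DB is touched.
        db = mock.Mock()
        db.fetch_invoices.return_value = [
            {"amount": 40, "paid": False},
            {"amount": 25, "paid": True},
            {"amount": 10, "paid": False},
        ]
        self.assertEqual(total_owed(db, customer_id=7), 50)
        # Also check that the code asked the DB layer for the right customer.
        db.fetch_invoices.assert_called_once_with(7)
```

Saved in a file, this runs with `python -m unittest`. Anything that depends on real SQL behaviour (stored procs, constraints) still needs the separate test database described above.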

I'd bet there are a 100 different answers to your question though. This is the way we handle it.

tommek4077 · on Oct 30, 2023

That is probably quite straightforward and of course they have 100% coverage.

wredue · on Oct 30, 2023

We usually use very complicated UIPath flows to “test” these things.

If that doesn’t exist yet (it usually doesn’t), we test manually, but only core workflows.

toasted-subs · on Oct 30, 2023

You have no idea how unbelievably annoying it is to work in a company that doesn't have a well-defined architecture. Every "buzz word" service should be easily justified.

This is why I hate recruiters, I can't even tell you how many times I've had a recruiter call me saying they are looking for service XYZ. The same concept rephrased in my resume. I have to rewrite my resume just to satisfy these people? No thanks.

recfab · on Oct 30, 2023

I had a recruiter pull that in my most recent job search. Had to stick "C#/..." in front of everything because they didn't understand that ASP.NET, WPF, WCF, WinForms and several other C#-specific tech had anything to do with .NET.

topaz0 · on Oct 30, 2023

Of course the answers on stack overflow are partly a result of resume-driven answerers.

tshaddox · on Oct 30, 2023

I think it's "iterative-StackOverflow-driven development" most of the time, and that actually causes the increased popularity of those resume keywords.

feoren · on Oct 30, 2023

This sounds like absolute hell. This is everything I hate about modern software development.

neycoda · on Oct 30, 2023

So, what tech service can I add to that bloated pipeline as a middle-man to get a fraction of a penny per transaction?

BonoboIO · on Oct 30, 2023

Resume Driven Development - RDD

Never heard that before, but that’s so on point.

kmfrk · on Oct 30, 2023

... And a partridge in a pear tree.

Agentlien · on Oct 30, 2023

I feel like I've never seen anything even reminiscent of this bad in the twelve years I've worked as a software engineer. I really want to believe this pipeline as described is satire. Yet, somehow, it does not quite seem that way. This scares me. But also somehow explains why some companies contain so incredibly much more engineering staff than I can possibly explain looking at their output.