
The blame here may indeed lie with whoever decided that reusing an old flag was a good idea. As anyone who has been in software development for any time can attest, this decision was not necessarily - and perhaps not even likely - made by a "developer."


9 times out of 10, I see developers making the mistakes that everyone seems to want to blame on non-technical people. There is a massive amount of software being written by people with a wide range of capabilities, and a large number of developers never master the basics. It doesn't help that some of the worst tools "win" and offer little protection against many basic mistakes.


You have to assume people will make mistakes.

A great book on that is this one: https://www.thenile.co.nz/books/sidney-dekker/the-field-guid...


A large number of developers never master the basics, that is true. But more interestingly, absolutely zero programmers can write a good amount of code that is free of bugs.

If your road to safety is bug-free code, it will end in an accident sooner or later, 100% guaranteed.


For a group who so thoroughly despises bosses that operate on 'blame allocation', we spend a lot of time shopping around for permission to engage in reckless behavior. Most people would call that being a hypocrite.

Whereas I would call it... no, hypocrite works just fine.


At the company where I work, we have a team that took 3 weeks and multiple tries to get an API response (JSON) capitalized properly (camelCase to PascalCase).

When I tried to talk to the tech lead about it, his response was that SAFe would have prevented the issue (it was discovered by another team that consumes their API).

Throughout the entire thing this tech lead maintained that his team didn't do anything wrong and that the problem was the process.

Yeah, no. I have 25+ years of experience as a developer, and it doesn't take 3+ weeks to fix the casing of a JSON property name. I eventually had to be the bad guy and tell them their work was unacceptable, because they couldn't recognize it themselves. When I did, I also ran it up the chain, because if the tech lead doesn't see the problem then I need someone who can help them see it.

For some people there's a "responsibility shield" that's so strong you can never get through to them.


Or at least not by a developer who has made that sort of mistake in the past.

I don't know what software engineering programs teach these days, but in the 1980s there was very little inclusion of case studies of things that went wrong. This was unlike the courses in the business school (my undergrad was a CS major + business minor) and, I would presume, unlike what real engineering disciplines teach.

My first exposure to a fuckup in production was a fuckup in production on my first job.


I wonder if this code was written in C++ or similar, the flags were actually a bitfield, and they repurposed one because they ran out of bits.

Need a space here? Oh, let's throw out this junk nobody used in 8 years and there we go...
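
To make that guess concrete, here is a minimal sketch of how a reused bit can go wrong, assuming a one-byte flags field with no spare bits. Every name in it (kBit3, old_build, new_build, and so on) is hypothetical and chosen for illustration; nothing here is taken from the actual system or its postmortem.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wire format: one flags byte, every bit already allocated.
namespace wire {
constexpr std::uint8_t kBit3 = 1u << 3;  // the "junk nobody used in 8 years"
}

// A host still running the old build: bit 3 triggers the legacy code path.
namespace old_build {
constexpr std::uint8_t kLegacyFeature = wire::kBit3;

inline bool runs_legacy_path(std::uint8_t flags) {
    return (flags & kLegacyFeature) != 0;
}
}  // namespace old_build

// A host running the new build: the same bit now means something else.
namespace new_build {
constexpr std::uint8_t kNewFeature = wire::kBit3;

inline std::uint8_t make_flags(bool use_new_feature) {
    return use_new_feature ? kNewFeature : std::uint8_t{0};
}
}  // namespace new_build

// A message built for the new feature wakes up the legacy path on any
// host that missed the deploy, because the bit pattern is identical.
inline void demonstrate_mismatch() {
    const std::uint8_t flags = new_build::make_flags(true);
    assert(old_build::runs_legacy_path(flags));
}
```

Nothing in the byte says which meaning is intended, so the only protection is making sure no old reader is left anywhere.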


It is very hard to change the overall size of the messages, and there's a lot of pressure to keep them short. So it could have been a bitfield or one of several similar things, e.g. a value in a char field.


This sounds particularly plausible given that it was high frequency trading. Those systems presumably have optimisations few other applications have.


This is the first thing I thought of because otherwise this story doesn't make a lot of sense.


At the very least have two deploys: one that actually removes the old code that relies on the flag, and then one that repurposes it. It's a giant foot gun to do it all in one, especially without any automated deploys.
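
A rough sketch of the gate between those two deploys, assuming each host can report which release it runs. The Release enum and safe_to_send_repurposed_flag are hypothetical names made up for this example, not anything from the incident.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical release stages for one host.
enum class Release : std::uint8_t {
    Legacy = 0,          // old code still reacts to the flag
    FlagRetired = 1,     // deploy 1: old code removed, flag ignored
    FlagRepurposed = 2   // deploy 2: flag carries the new meaning
};

// The new meaning may only be sent once no host can still run the old
// code path. An empty fleet report is treated as "unknown", not as safe.
inline bool safe_to_send_repurposed_flag(const std::vector<Release>& fleet) {
    return !fleet.empty() &&
           std::all_of(fleet.begin(), fleet.end(), [](Release r) {
               return r >= Release::FlagRetired;
           });
}
```

With a check like this, a single host that missed deploy 1 blocks the switch instead of silently running the old code.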


Good point. Actually, I think I'll treat this as a best practice in general whenever there's a transition.


That assumes that you have a stable, reliable, quick process to roll out updates. Sounds like they didn't, so maybe they worked on the "oh, better add this feature, it's our only chance this month" pattern.


>whoever decided that reusing an old flag was a good idea.

My understanding is that in high frequency trading, minimizing the size of the transmission is paramount. Hence repurposing an existing flag, rather than adding size to the packet, makes some sense.


Flag recycling is a task that should be measured in months to quarters, and from what I recall of the postmortem they tried to achieve it in weeks, which is just criminally stupid.

It's this detail of the story which flips me from sympathy to schadenfreude. You dumb motherfuckers fucked around and found out.


I doubt any manager or VP cares or knows enough about the technical details of the code to dictate the name that should be used for a feature flag, of all things.



