Responsibility lies on large numbers of processes, teams, and people.
I thought this blog might have some substance about proper postmortem investigations and how to evaluate and address the circumstances that led to a failure like this, but it has none of that. It’s just a very angry rant about CEOs and middle management. The premise is that engineers can’t bear any responsibility for their actions because they don’t get “respect”.
This has to be the 10th time I’ve seen arguments that “blame” is the right action in this case, but with the key exception that we’re only allowed to blame people other than the engineers. The last article was a lengthy rant about how it’s actually QA’s fault and engineers shouldn’t be expected to ensure their own code is correct, therefore engineers are blameless.
This is empty calories for people who like ragebait, but nothing more.
Nah, that's not what the blog post says. It says engineers aren't given sufficient responsibility, so that they can assume the respective blame, when something goes wrong.
If an engineer says it would take 1 month to build a feature so it's sufficiently reliable, and the manager says "nah that's ridiculous, you have 1 week", then when the cobbled together feature breaks in production the manager should take the blame, since they effectively took the responsibility away from the engineer, upon themselves.
I used to think that way, until I had my own junior developers, and felt like that guy trying to bake a cake with three beagles: They veer off, constantly, in all directions.
Developers deep in the trenches tend to have a bad feeling for business requirements or constraints; coupled with a knack for perfectionism and premature optimisation, that really often results in ridiculous time frames that are just plain unrealistic and would ruin the organisation long term.
I don’t have any profound insights, though: The only sane mantra can be keeping things in balance. Too much management, you drive your devs insane; too much engineer control, and the architecture astronauts reinvent the wheel every other day.
It's hard to tell whether perfectionism means doing a pointless refactor to cloud microservice event-driven k8s buzzword Rust WASM-on-the-server sharded graph databases, or whether it means spending 30 minutes putting a password on that MongoDB instance your data science team wants to load prod data into.
Yeah, exactly. That’s why you want a culture of collaboration between engineers and managers to come together and decide what is important. It’s just hard to keep that up in practice, especially if a company grows.
> Developers deep in the trenches tend to have a bad feeling for business requirements
In my experience, this is because the developers are removed from interacting with the business. How are they supposed to make good decisions if they don't talk to their customers and understand what they are aiming for?
And I don't mean have the business folks show a roadmap once a year.
I have a similar opinion of being micromanaged. The micromanager is like a chaos demon that keeps pointing me in random directions. I lose all internal vision / intuition and turn into an unhappy task robot.
> If an engineer says it would take 1 month to build a feature so it's sufficiently reliable, and the manager says "nah that's ridiculous, you have 1 week"
This is just standard corporate accountability avoidance on the side of the engineer though. Most people don’t want to be accountable for any risk so they advise against it, or give impractical advice, so that somebody else has to make the decision and hold the accountability.
The blog responds to a very particular aspect of the fallout from the CrowdStrike outage. The "Responsibility lies on large numbers of processes, teams, and people" point was actually addressed in the article. It makes the case that executives claim that responsibility is correlated with pay. All the author asks in this case is for them to walk the walk.
The premise of the post was a response to the ridiculous claim that when something goes bad, we need to blame the engineer(s) who pressed the button.
I tried, through a rant, to demonstrate that there are other people to blame, starting from politicians who are incompetent in what they do, to CEOs who get compensated for taking the risk, to managers who cut corners, etc.
The culmination of the post is that if you want to blame someone, you might as well blame any of the involved parties. But instead, if we want to prevent such issues in the future, we need to understand that the entire process is broken, rather than throwing individuals under the bus.
The only bone I’d pick with the article is blaming regulations. The regulations in question rarely say anything particularly boneheaded. Blanket compliance culture interprets those regulations in boneheaded ways. Because to do it any other way would be much more expensive.
Basically, when a major incident occurs the "correct" action is to throw people in jail. Those people, depending on your political persuasion, are either the person who pushed the button, middle managers (because they're the convenient scapegoats), or the C-suite.
> Those people, depending on your political persuasion, are either the person who pushed the button
That’s not how it works in any industry, ever. A single person can’t launch nukes, blow up a reactor, collapse a bridge, or otherwise cause billions of dollars in damages by accidentally pressing one button.
I think we're in agreement--a fat finger shouldn't be able to cause a disaster--but there often seems to be a sentiment of kill them all and let god sort them out.
So there will be public congressional hearings on this, and the CEO will probably be called to testify. The company has been heavily outsourcing development to India, and it's not the kind of cultural environment where a developer is going to push back and request more time for testing.
The CEO will of course blame some low-level employees who did not follow procedures...