Things fail slower then faster than people expect.
Lots of people were waiting with baited breath for Twitter to fail the first week Musk bought it, but most of the failure conditions had been automated away...
Instead the reliability loss creep happened more slowly. The people that understood the edge conditions in the system were fired. Then new changes need implemented and the old, now not understood systems, were ripped out and new pieces put in. And lo and behold they have tons of failure conditions and edge cases that can knock out the system.
Even if your programming team wrote immaculate documentation on all these edge conditions, it can take an immense amount of time to both read them, and then fully understand them, and when you have a micromanager breathing down your neck for changes 10 minutes ago this is what happens.
Lots of people were waiting with baited breath for Twitter to fail the first week Musk bought it, but most of the failure conditions had been automated away...
Instead the reliability loss creep happened more slowly. The people that understood the edge conditions in the system were fired. Then new changes need implemented and the old, now not understood systems, were ripped out and new pieces put in. And lo and behold they have tons of failure conditions and edge cases that can knock out the system.
Even if your programming team wrote immaculate documentation on all these edge conditions, it can take an immense amount of time to both read them, and then fully understand them, and when you have a micromanager breathing down your neck for changes 10 minutes ago this is what happens.