This article is capitalizing on the CrowdStrike incident. It was costly, but it was a mistake, and as a software engineer I think that's all it is. I don't see an upward trend in these mistakes: engineering teams generally try to be careful, and sometimes they get careless anyway. Some additional processes might be added to avoid a repeat, but years later something similar may happen at another company. I don't think it's evidence of "software erosion." And the recovery cost a day or two, but it was fixed and we all went back to normal.
I worked on AOL 5.0. It did crash machines with a specific softmodem driver. The bug was in the driver, but we had to work around it after the gold master release. We didn't have that specific machine/driver combination in the QA lab, but the execs all had laptops that uncovered the behavior.
The way for CrowdStrike to have avoided their incident was adding a very basic (borderline trivial) step to the merge/release pipeline: make sure machines can still boot after running the to-be-deployed version.
That's really not much overhead, nor is it a novel or groundbreaking process. Either they chose not to do it, or someone proposed it and they decided not to spend the engineering time on it. A rough sketch of what that gate could look like is below.
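To be clear, I don't know anything about CrowdStrike's actual pipeline; the gate I have in mind is roughly: revert a test VM to a clean snapshot, boot it with the candidate update staged, and fail the release if the machine never comes back. Here's a minimal Python sketch under those assumptions, using libvirt's `virsh` CLI; the VM name, snapshot name, and guest address are all hypothetical placeholders:

```python
# Minimal boot-check gate sketch (not CrowdStrike's actual process).
# Hypothetical assumptions: a libvirt-managed test VM "win-sensor-canary"
# with a snapshot "baseline" taken while powered off, a guest reachable at
# 192.168.122.50:3389 once booted, and an out-of-band step that has already
# staged the candidate update inside the guest so it loads on next boot.
import socket
import subprocess
import sys
import time

VM = "win-sensor-canary"                # hypothetical VM name
SNAPSHOT = "baseline"                   # hypothetical clean snapshot
GUEST_ADDR = ("192.168.122.50", 3389)   # hypothetical guest IP + RDP port
BOOT_TIMEOUT_S = 600

def virsh(*args):
    # Run a virsh subcommand and raise if it fails.
    subprocess.run(["virsh", *args], check=True)

def guest_is_up():
    # Treat a successful TCP connection to the guest as "it booted".
    try:
        with socket.create_connection(GUEST_ADDR, timeout=5):
            return True
    except OSError:
        return False

def main():
    virsh("snapshot-revert", VM, SNAPSHOT)  # back to a known-good disk state
    virsh("start", VM)                      # boot with the candidate staged
    deadline = time.monotonic() + BOOT_TIMEOUT_S
    while time.monotonic() < deadline:
        if guest_is_up():
            print("boot check passed")
            return 0
        time.sleep(10)
    print("boot check FAILED: guest never came up; blocking release")
    return 1

if __name__ == "__main__":
    sys.exit(main())
```

Staged rollouts and canary rings are the bigger fix, of course, but even a smoke test at this level would likely have caught an update that prevents machines from booting.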
There is definite enshittification of software happening all around us, with companies unable to accept that a product could ever be finished, piling on feature bloat to protect themselves from up-and-coming startups taking a piece of their cake. This means both good features and bad ones get added, and things have to change constantly, making the entire end-user experience worse. It complicates things on the software development side as well: tech debt grows, the architecture was never designed with some of these features in mind, and QA is harder to do well given the larger surface area. That leads to a dystopian view of how things are, and when a mistake happens an echo chamber can easily form that makes these views ("software sucks") feel like postulates.
On the other hand, we've never been surrounded by so much software in history. It keeps growing, will keep growing, and so far the earth is not collapsing. There's so much that depends on people typing code into their editors that it's truly amazing we've reached this point. Keeping everything afloat in this new reality is increasingly difficult, because many of these systems work together and require a broad understanding of many domains (not every product/company has the budget for multiple roles, so you get one person doing infra/code/QA with a multitude of tools) to keep them running without issues. The number of interactions people have with code keeps increasing, so when a problem hits software used by a lot of customers it becomes *very* visible and it feels like nothing works. But in reality, the thousands of microprocessors in close proximity to those same people keep chugging along: their phones, payment cards, headphones, monitors, TVs, speakers, smart *x*s, coffee makers, thermostats, etc. are as reliable as they ever were, with a lot more to offer. So the opposite view could also be very realistic: software has never had this level of quality.