No respect for any CIO who makes their business depend on a single operating system and runs automatic updates of system software without any canaries or phased deployments.
While I believe Linux is a more reasonable operating system than Windows, shit can happen everywhere.
So if you have truly mission-critical systems, you should probably have at least two significantly different systems, each of which can maintain some emergency operations independently. Doing this with two Linux distros is easier than doing it with Linux and Windows. For workstations, Macs could be considered; for servers, BSD.
Probably many companies will accept the risk that everything goes down. (Well, they probably don't say that. They say maintaining a healthy mix is too expensive.)
In that case you need a clearly phased approach to all updates. First update some canaries used by IT. If that goes well, update 10% of production. If that goes well (and you have to wait until affected employees have actually worked for a reasonable time), you can roll out to increasingly larger portions.
No testing in a lab (whether at the vendor or in your own IT) will ever find all problems. If something slips through and affects 10% of your company, that is significantly different from affecting (nearly) everyone.
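To make that concrete, here is a minimal sketch of such gating in Python; the ring names, fractions, soak times, and the deploy/healthy helpers are all hypothetical placeholders, not any vendor's actual rollout mechanism:

    import time

    # Hypothetical rings and thresholds; tune to your own fleet.
    STAGES = [
        # (ring name, fraction of fleet, soak time in seconds)
        ("it-canaries", 0.01, 4 * 3600),   # IT's own machines first
        ("early-ring", 0.10, 8 * 3600),    # ~10% of production, soak a working day
        ("broad-ring", 0.50, 8 * 3600),
        ("everyone", 1.00, 0),
    ]

    def deploy(update_id: str, ring: str, fraction: float) -> None:
        """Placeholder: push the update to this fraction of hosts."""
        print(f"deploying {update_id} to {ring} ({fraction:.0%} of the fleet)")

    def healthy(ring: str) -> bool:
        """Placeholder: check crash telemetry / support tickets for this ring."""
        return True

    def rollout(update_id: str) -> bool:
        for ring, fraction, soak in STAGES:
            deploy(update_id, ring, fraction)
            time.sleep(soak)  # wait until affected employees have actually worked
            if not healthy(ring):
                print(f"halting {update_id}: problems in {ring}")
                return False  # blast radius stays at this ring's fraction
        return True

The point is only that each stage blocks on real-world evidence before the next one, so a bad update stops at 1% or 10% of the fleet instead of at (nearly) everyone.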
What makes you think Windows is the only alternative? Have you never heard of GNU Hurd?
More seriously, I am not saying you should run critical services on MenuetOS or RISC OS, but the BSDs are still alive and kicking, as are illumos and its derivatives. And yes, I think a bit of diversity adds some resilience. It may require more staff, but imho the extra resilience is worth the downsides.
Presumably they do test their updates; maybe the tests just aren't good enough.
The ideal would be to do canary rollouts (1%, then 5%, then 10%, etc.) to minimise blast radius, but I guess that's incompatible with antivirus tools protecting you from 0-day exploits.
While I'm usually a proponent of update waves like that, I know some teams can get lax about the idea if they decide an update isn't worth that level of care.
I'm not saying CS doesn't care enough, but what looks like a minor update to the team that shipped it, one not seen as needing a slow rollout, may be exactly the kind of change that should be supervised that way.
Our worst outage occurred when we were deploying some kernel security patches, grew complacent, and updated the main database and its replica at the same time. We had a maintenance window with downtime scheduled at the same time anyway, so whatever. The update had worked on the other couple hundred systems.
Except, unknown to us, our virtualization provider was having a massive infrastructure issue at exactly that moment, preventing VMs from booting back up... Failing services over to the secondary DC did not make for a fun night.
The issue is the update rollout process, the lack of diversity in these kinds of tools across the industry, and the software industry's absolute failure to make decent software without bugs and security holes.