> a surprisingly small amount of water ingress would trip a breaker while leaving the racks in good order.
If that were the case they wouldn't be saying "There is no current ETA for recovery," and "it is expected to be an extended outage. Customers are advised to failover to other regions."
There's a lot more to a datacenter building than just the servers sitting on racks. In particular here there was a fire in the power-serving infrastructure (caused by the flood presumably). So nearly all of those servers could be totally fine, just off, but if the power distribution network in the building is literally fried, that's gonna take a long time to fix.
“Test in a live environment before the region becomes open to customers” is not entirely representative of “the region had an emergency shutdown with customers on it.” And the latter is obviously something you can’t reliably test - unless you decide to crash a whole region under live traffic.
I’m sure they have checklists and procedures, but an unknowable laundry list of things will go wrong.