
That's similar to the total outage of all Rogers services in Canada back on July 8th, 2022. It was compounded by the fact that the outage took out all Rogers cell phone service, making it impossible for Rogers employees to communicate with each other during the outage. A unified network means a unified failure mode.

Thankfully none of my 10 Gbps wavelengths were impacted. Oh did I appreciate my aversion to >= layer 2 services in my transport network!



That's kind of a weird ops story, since SRE 101 for on-call is to not rely on the system you're on call for to resolve outages in it. This means if you're on call for communications of some kind, you must have some other independent means of reaching each other (even if it's a competitor's phone network).


That is heavily contingent on the assumption that the dependencies between services are well documented and understood by the people building the systems.


Are you asserting that Rogers employees needed documentation to know that Rogers Wireless runs on Rogers systems?


Rogers is perhaps best described as a confederacy of independent acquisitions. In working with their sales team, I have had to tell them where their facilities are, as the sales engineers don't always know about all of the assets that Rogers owns.

There's also the insistence that Rogers employees should use Rogers services. Paying for every Rogers employee to have a Bell cell phone would not sit well with their executives.

Incorrect risk assessments of the changes being made to the router configuration also contributed to the outage.



