The most bizarre OpsGenie story is that in 2022, the tool was down for 2 weeks for hundreds of unlucky Atlassian customers. This happened while JIRA had an outage affecting a small percentage of Atlassian's customer base, which still meant hundreds of organizations (with tens of thousands of users).
Most companies can operate for some time without JIRA, but losing your paging service means you're flying in the dark. And yet, Atlassian did not prioritize restoring OpsGenie.
I covered the details at the time [1]. To this day, this incident is a real head-scratcher and makes me wonder whether Atlassian has internalized how much more critical incident alerting software is than ticketing software (JIRA) or a wiki (Confluence).
[1] https://newsletter.pragmaticengineer.com/i/52148641/what-atl...