From the Cloudflare incident: > Cloudflare’s critical Workers KV service went of...

voytec · 2025-06-12T20:18:37 1749759517

> outage of a 3rd party service that is a key dependency.

Good to know that Cloudflare has services seemingly based on GCP with no redundancy.

londons_explore · 2025-06-12T22:23:48 1749767028

Probably unintentional. "We just read this config from this URL at startup" can easily snowball into "if that URL is unavailable, this service will go down globally, and all running instances will fail to restart when the devops team try to do a pre-emptive rollback"
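To make that failure mode concrete, here is a minimal sketch of the usual mitigation: keep a last-known-good copy of the config on local disk and fall back to it when the URL is unreachable. Everything here is illustrative and assumed, not anything Cloudflare has described: the `load_config` name, the JSON format, the cache path and the injectable `fetch` parameter are all hypothetical.

```python
import json
import os
import tempfile
import urllib.request


def load_config(url, cache_path, timeout=3.0, fetch=None):
    """Return (config, source), where source is "remote" or "cache".

    Hypothetical helper: tries the remote URL first and falls back to the
    last-known-good copy on disk if the fetch or the JSON parse fails.
    """
    if fetch is None:
        def fetch(u):
            # Default fetcher; a timeout keeps startup from hanging forever.
            with urllib.request.urlopen(u, timeout=timeout) as resp:
                return resp.read()

    try:
        raw = fetch(url)
        config = json.loads(raw)
    except (OSError, ValueError) as exc:
        # URLError is an OSError; JSONDecodeError is a ValueError.
        try:
            with open(cache_path, "rb") as f:
                return json.loads(f.read()), "cache"
        except (OSError, ValueError):
            raise RuntimeError("no remote config and no usable cache") from exc

    # Remote copy is valid: persist it atomically as the new last-known-good.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(raw)
    os.replace(tmp_path, cache_path)
    return config, "remote"
```

With something like this, a restart during the dependency's outage comes up on slightly stale config instead of failing. A first-ever start with no cache still fails, so the dependency is reduced rather than removed.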

__turbobrew__ · 2025-06-13T03:52:14 1749786734

After reading about Cloudflare's infra in their post-mortems, it has always been surprising how immature their stack is. For example, they used to run their entire global control plane in a single failure domain.

I'm not sure who is running the show there, but the whole thing seems kinda shoddy given Cloudflare's position as the backbone of a large portion of the internet.

I personally work at a place with less market cap than Cloudflare, and we were hit by the exact same incident (datacenter power went out) and had almost no downtime, whereas the entire Cloudflare API was down for nearly a day.

fruit_snack · 2025-06-19T06:12:28 1750313548

Nice job keeping your app up during the outage but I'm not sure you can say "the whole thing seems kinda shoddy" when they're handling the amount of traffic they are.

tibbar · 2025-06-12T22:57:51 1749769071

What's the alternative here? Do you want them to replicate their infrastructure across different cloud providers with automatic fail-over? That sounds -- heck -- I don't know if modern devops is really up to that. It would probably cause more problems than it would solve...

arccy · 2025-06-12T23:04:34 1749769474

They're a company that has to run their own datacenters, you'd expect them to not fall over when a public cloud does.

hplk · 2025-06-12T23:32:01 1749771121

I was really surprised. Depending on another company's cloud services is risky in general, I think. Pretty much everyone does it these days, but I didn't expect Cloudflare to.

calvinmorrison · 2025-06-13T00:02:14 1749772934

Well, at some level you can also contract for private deployments of a cloud.

UltraSane · 2025-06-13T03:26:24 1749785184

AWS has Outposts racks that let you run AWS instances and services in your own datacenter, managed like the ones running in AWS datacenters. Neat but incredibly expensive.

voytec · 2025-06-13T01:50:49 1749779449

> What's the alternative here? Do you want them to replicate their infrastructure

Cloudflare advertises themselves as _the_ redundancy / CDN provider. Don't ask me for an "alternative" but tell them to get their backend infra shit in order.

ghshephard · 2025-06-13T00:35:37 1749774937

There are roughly 20-25 major IaaS providers in the world that should have close to no dependency on each other. I'm almost certain that Cloudflare believed that was their posture, and that the action items coming out of this post-mortem will be to make sure that this is the case.

somanyphotons · 2025-06-12T23:32:48 1749771168

I would expect them to not rely on GCP at all

ProAm · 2025-06-12T23:47:43 1749772063

Google is an advertising company not a tech company. Do not rely on them performing anything critical that doesn't depend on ad revenue.

dylan604 · 2025-06-13T00:06:52 1749773212

What does that make Amazon?

tapoxi · 2025-06-13T00:14:11 1749773651

A cloud services company. AWS is much bigger than Amazon retail at this point.

arghwhat · 2025-06-12T23:18:46 1749770326

Redundancy ≠ immune to failure.

bravetraveler · 2025-06-12T21:25:28 1749763528

Content Delivery Thread