
Google’s “best practices” led them to delete a customer’s entire $135 billion pension account [1]. I’m surprised anyone is still reading anything Google writes.

1. https://arstechnica.com/gadgets/2024/05/google-cloud-acciden...



You’re assuming that those systems were all implemented to the letter of that guide. That’s never the case. Often these types of guidelines are written to address recurring problems found in an organization.


If we should only read things written by organizations that make no mistakes, then we will never read anything.


> If we should only read things written by organizations that make no mistakes, then we will never read anything.

That was a “mistake” that should not have even been possible. If the pension fund had not used a multi-cloud strategy, the entire business would have been lost. Misconfiguring Kafka and losing some data is a mistake; deleting an entire account should not be given a pass.


The recent postmortem says they were able to recover from backups on gcp, so I don't think this is true.


> The recent postmortem says they were able to recover from backups on gcp, so I don't think this is true.

“UniSuper, an Australian pension fund that manages $135 billion worth of funds and has 647,000 members, had its entire account wiped out at Google Cloud, including all its backups that were stored on the service. UniSuper thankfully had some backups with a different provider and was able to recover its data, but according to UniSuper's incident log, downtime started May 2, and a full restoration of services didn't happen until May 15.”

Google didn’t recover the data, the customer recovered their data from a different cloud provider.


From https://cloud.google.com/blog/products/infrastructure/detail...

> This incident did not impact:

> Any other Google Cloud service.

> Any other customer using GCVE or any other Google Cloud service.

> The customer’s other GCVE Private Clouds, Google Account, Orgs, Folders, or Projects.

> The customer’s data backups stored in Google Cloud Storage (GCS) in the same region.

...

> Data backups that were stored in Google Cloud Storage in the same region were not impacted by the deletion, and ... were instrumental in aiding the rapid restoration.

Emphasis mine.

You're quoting, as far as I can tell, an Ars Technica article that makes unsourced claims about backups being deleted. Neither UniSuper's nor Google's previous statements ever mentioned anything about backups being deleted.


> were instrumental in aiding the rapid restoration.

I don’t call 13 days a rapid restoration. I also don’t trust Google’s post-mortem documentation more than an independent news organization to be honest about what really happened. Especially while Google is actively gaslighting their users about the errors in its AI search [1].

1. https://www.theverge.com/2024/5/24/24164119/google-ai-overvi...


It is, we'll go with, weird, to presume a random news article making baseless claims is correct over the, like, actual people who addressed the problem.

I'll reiterate, no one involved in the restoration (UniSuper or Google) ever said anything about Google's backups being deleted. In fact, basically everything Google and UniSuper have said specifies that it was only the VM config that was removed. Ars made up the thing about backups being deleted, which makes an exciting headline, but it doesn't appear at all reliable or based in reporting, just conjecture.


> Ars made up the thing about backups being deleted, which makes an exciting headline, but it doesn't appear at all reliable or based in reporting, just conjecture.

So you just label a reputable news outlet as fake news and then move on..?


I'm saying Ars is jumping to conclusions a bit too quickly. You can call that fake news if you want, I didn't.


“This is an isolated, ‘one-of-a-kind occurrence’ that has never before occurred with any of Google Cloud’s clients globally. This should not have happened. Google Cloud has identified the events that led to this disruption and taken measures to ensure this does not happen again.”

“UniSuper had backups in place with an additional service provider. These backups have minimised data loss, and significantly improved the ability of UniSuper and Google Cloud to complete the restoration.”

https://www.unisuper.com.au/about-us/media-centre/2024/a-joi...

Those quotes were pulled directly from UniSuper’s website. Google deleted an account, lost the data, and then took 13 days to recover the pension fund from data stored on another data provider. Maybe you should consider that your employment at Google is damaging your objectivity.


Can you please quote in particular where unisuper mentions that the backups at Google were deleted?

Like I keep saying, nothing in the primary sources supports the claim that either the backups or the accounts were deleted. You've jumped to a particular conclusion, and seem unwilling to adjust that conclusion in light of new evidence.


It was quoted above, you seem unwilling to accept the facts.


You quoted two paragraphs, neither of which mentioned either Google missing backups or UniSuper's account being deleted. In fact, what you quoted aligns perfectly with what I've been suggesting the whole time.

You're making a strong claim, I'm asking you to source it specifically. Instead you're taking a statement from which you can draw multiple conclusions, and picking one (that has been contradicted repeatedly) and telling me I'm unwilling to accept the facts. But they aren't facts, they're your interpretation of vague statements.

I'm happy to accept facts. Facts like "Data backups that were stored in Google Cloud Storage in the same region were not impacted by the deletion" are very easy to understand and difficult to misinterpret. Do you disagree?

Like, even the additional reporting the ars article links to (https://danielcompton.net/google-cloud-unisuper, https://x.com/milesward/status/1792909048830214607?t=Vu__q1h...) basically contradicts both their and your conclusions. You're weirdly hung up on this.


“UniSuper had backups in place with an additional service provider. These backups have minimised data loss, and significantly improved the ability of UniSuper and Google Cloud to complete the restoration.”

Again, the same quote I already quoted above, direct from UniSuper’s website. They needed to use their backups at a different cloud provider, as GCP’s data wasn’t recoverable. I don’t know why you’re arguing so strongly against this.


That they used external backups doesn't actually imply anything about the GCS backups being unavailable. And Google's press release explicitly notes that UniSuper used both. (And there are all kinds of reasons to have used both, both good and bad.)

Put formally, we have statements that

    - 1. A and B exist
    - 2. A was used
    - 3. A and B were used
Your conclusion from these statements is that, because (2) A was used, therefore B does not exist. Hopefully putting it like this makes it clear why I'm so confused.
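
If it helps, here is a rough sketch of the same point as a brute-force check. The variable names and the "a backup can only be used if it exists" rule are my own assumptions, not anything taken from either company's statements. It enumerates every possible world and keeps the ones consistent with statements (1) to (3):

```python
from itertools import product

def consistent_with_statements(a_exists, b_exists, a_used, b_used):
    """True if this possible world satisfies statements (1)-(3)."""
    # Assumption (mine): a backup can only be used if it exists.
    used_implies_exists = (a_exists or not a_used) and (b_exists or not b_used)
    s1 = a_exists and b_exists   # (1) A and B exist
    s2 = a_used                  # (2) A was used
    s3 = a_used and b_used       # (3) A and B were used
    return used_implies_exists and s1 and s2 and s3

# Enumerate all 16 truth assignments and keep those matching the statements.
models = [w for w in product([False, True], repeat=4)
          if consistent_with_statements(*w)]

# The disputed conclusion: "B does not exist".
conclusion_holds_somewhere = any(not b_exists for (_, b_exists, _, _) in models)

print(models)
print(conclusion_holds_somewhere)
```

Only one world survives, the one where both backups exist and both were used, and "B does not exist" is false in it. So the conclusion is not just unsupported by the statements, it contradicts them.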


I’ll concede my point, well argued.


That was seven years later. Maybe the problem is that Google stopped reading what Google wrote.


Google : We will breach the rules we preach.


Right, because every line of code written by tens of thousands of Google engineers is being validated against every guidebook.


> That was seven years later. Maybe the problem is that Google stopped reading what Google wrote.

The problem is that it was never that good. Anyone who has used K8s at scale will tell you at length how it doesn’t scale. People should stop focusing on tech companies like celebrities and focus instead on domain problems related to their business.


The funny thing with k8s is that Google doesn't use it (except GKE, and there's a reason it's one cluster per customer).

Their internal tooling scales just fine, but all it shares with k8s is some of the underlying concepts. Unlike, say, Bazel, gVisor or Gerrit, which are the real thing (minus some secret sauce tied to internal infra). k8s is good software, and best-in-class when it comes to open source options, but the idea that it is "open source Borg" is silly.


> k8s is good software, and best-in-class when it comes to open source options

No it isn’t, it’s a solution in search of a problem that is needlessly complex, wastes engineering cycles on what could have been product development, and has violated every principle of orthogonal design.


Oh, completely ignoring anything anyone from Google ever writes again? This is akin to the cancel culture which we all know is how society should work. /s


> Oh, completely ignoring anything anyone from Google ever writes again? This is akin to the cancel culture which we all know is how society should work. /s

Maybe if Google focused on doing actual work instead of writing feel good engineering pieces, they wouldn’t have the Google graveyard and an unstable cloud offering that may spontaneously delete multi-billion dollar accounts.


Alphabet has 2T market cap, get your head out of your sitting place, lol.


> Alphabet has 2T market cap, get your head out of your sitting place, lol.

That same sort of thinking is what led to the downfall of yahoo.


The fall of yahoo was caused by out of touch C suite not engineering handbooks.


> The fall of yahoo was caused by out of touch C suite not engineering handbooks.

Are you saying Google’s c suite isn’t out of touch?


Are you saying Google engineering books are excellent?


Maybe. Times are different now.


> Maybe. Times are different now.

Are they?


Are they not?



