
Let people take the risk - some things in production are less important than others.

They have all the primitives. I think it's just that people are looking for a less raw version than AWS. In fact, perhaps many of these users should be using a platform that sits on top of AWS, or if they're just playing around with an EC2 instance they're probably better off with Digital Ocean or something.

AWS is less like your garage door and more like the components to build an industrial-grade blast-furnace - which has access doors as part of its design. You are expected to put the interlocks in.

Without the analogy, the way you do this on AWS is:

1. Set up an SNS topic

2. Set up AWS budget notifications to post to it

3. Set up a Lambda subscribed to the SNS topic

And then in the Lambda you can write your own logic, which can be as smart as you like: stop all EC2 instances but leave RDS alone, keep your current S3 data but flip any public buckets to private, and so on.
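A minimal sketch of that Lambda in Python with boto3, assuming the budget alert arrives via SNS; the bucket list is a placeholder and the "smart" part is whatever policy you decide on:

    import boto3

    ec2 = boto3.client("ec2")
    s3 = boto3.client("s3")

    # Placeholder: buckets you normally serve publicly but want locked down once the budget trips.
    PUBLIC_BUCKETS = ["my-public-assets"]

    def lambda_handler(event, context):
        # AWS Budgets posts a plain-text alert to the SNS topic.
        print("Budget alert:", event["Records"][0]["Sns"]["Message"])

        # Stop every running EC2 instance; RDS is a separate service and is left alone.
        reservations = ec2.describe_instances(
            Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
        )["Reservations"]
        instance_ids = [i["InstanceId"] for r in reservations for i in r["Instances"]]
        if instance_ids:
            ec2.stop_instances(InstanceIds=instance_ids)

        # Keep the S3 data, but block public access on the formerly public buckets.
        for bucket in PUBLIC_BUCKETS:
            s3.put_public_access_block(
                Bucket=bucket,
                PublicAccessBlockConfiguration={
                    "BlockPublicAcls": True,
                    "IgnorePublicAcls": True,
                    "BlockPublicPolicy": True,
                    "RestrictPublicBuckets": True,
                },
            )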

The obvious reason why "stop all spending" is not a good idea is that it would have to mean things like "delete all my S3 data and my RDS snapshots", which perhaps some hobbyists would be happy with, but which is more likely a footgun for the majority of AWS users.

In the alternative world, where the customer's post is "I set up the AWS budget with the stop-all-spending option and it deleted all my data!", you can't really give them back the data. But in this world, you can give them back the money. So this is the safer of the two.


> they're probably better off with Digital Ocean or something.

I run my home-dev stuff on DigitalOcean and had to write my own cronjob that polls for usage cost and kills everything if I exceed X.

No better than AWS in terms of "don't financially ruin me if I'm targeted by a DDoS".
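For anyone wanting to do the same, a rough sketch of that cronjob in Python, assuming DigitalOcean's balance endpoint and its month_to_date_usage field; the paths, field name and limit are assumptions to check against their API docs:

    import os
    import requests

    API = "https://api.digitalocean.com/v2"
    HEADERS = {"Authorization": f"Bearer {os.environ['DO_TOKEN']}"}
    SPEND_LIMIT = 50.0  # the "X" above, in USD

    def month_to_date_usage() -> float:
        # Assumed endpoint returning the account balance, including month-to-date usage.
        r = requests.get(f"{API}/customers/my/balance", headers=HEADERS, timeout=30)
        r.raise_for_status()
        return float(r.json()["month_to_date_usage"])

    def kill_everything() -> None:
        r = requests.get(f"{API}/droplets", headers=HEADERS, timeout=30)
        r.raise_for_status()
        for droplet in r.json()["droplets"]:
            # Destructive on purpose: powered-off droplets are still billed,
            # so "kill everything" here really does mean destroy.
            requests.delete(f"{API}/droplets/{droplet['id']}", headers=HEADERS, timeout=30)

    if __name__ == "__main__":
        if month_to_date_usage() > SPEND_LIMIT:
            kill_everything()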


The UK's is a one-day affair, with polling stations typically open from 7 am to 10 pm.


With the option to do a postal vote or to vote by proxy.


Tbh, decompiling software and figuring out how it works isn’t easy, but that’s part of the fun :) - it’s the reason I’ve ended up following many of the weird paths in computing that I have.


I use Excel, but not for financial modelling. I’ll use it


Do you predict the same for Anthropic? Hopefully they will stick around.


The problem, I think, is that if OpenAI fails, it'll take a lot of other AI companies down with it, simply because funding will be redirected away from the field entirely. If you're profitable, then you're probably going to be fine; if anything, your operating costs will go down as there is less competition for staff and compute.


If we go purely by economics, then Anthropic belongs to the same category of LLM corporations: ones whose only product is the LLM, as opposed to the likes of Google, Microsoft, even Facebook. Sure, these LLM-first corporations have a tiny lead in both technology and (LLM) brand recognition, but it is shrinking fast. I suspect that only companies that bundle LLMs with other big products (and do it cheaply) will survive in the long run.


Do they really need a full mirror of production?


Every time I've worked somewhere without one, we've wanted it, and we've wasted more developer hours trying to reproduce issues while working around the differences between environments than the mirror would have cost.


It’s a bit of a guess though isn’t it?


It's the most plausible, fact-based guess, beating other competing theories.

Understaffing and absences would clearly lead to a delayed incident response, but such obvious negligence and breach of contract would have been avoided by a responsible cloud provider, which is supposed to ensure adequate people on duty.

An exceptionally challenging problem is unlikely to be enough to cause so much fumbling because, regardless of the complex mistakes behind it, a DNS misunderstanding doesn't have a particularly large "surface area" for diagnostic purposes, and it is supposed to be expeditiously resolvable by standard means (ordering clients to switch to a good DNS server and immediately use it to obtain good addresses) that AWS should have in place.

If AWS engineers were formerly competent but are now stupid, with no organizational issues involved, that might be explained by brain damage: "RTO" might have caused collective chronic poisoning, e.g. lead in the drinking water, but I doubt Amazon is that cheap.


> An exceptionally challenging problem is unlikely to be enough to cause so much fumbling because, regardless of the complex mistakes behind it, a DNS misunderstanding doesn't have a particularly large "surface area" for diagnostic purposes, and it is supposed to be expeditiously resolvable by standard means (ordering clients to switch to a good DNS server and immediately use it to obtain good addresses) that AWS should have in place

You seem to be misunderstanding the nature of the issue.

The DNS records for DynamoDB's API disappeared. The endpoint resolves to a constantly changing set of IPs.

A ton of AWS services that use DynamoDB could no longer reach it. Hardcoding IPs wasn't an option, and clients couldn't do anything on their side either.
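You can see the moving-target nature of the endpoint yourself; a quick sketch (the endpoint name is the standard public us-east-1 one, and the output is simply whichever A records happen to be live):

    import socket
    import time

    # Resolve DynamoDB's regional endpoint a few times; the A records rotate constantly,
    # which is why hardcoding IPs was never an option for the affected services.
    for _ in range(3):
        addrs = sorted({a[4][0] for a in socket.getaddrinfo(
            "dynamodb.us-east-1.amazonaws.com", 443, proto=socket.IPPROTO_TCP)})
        print(addrs)
        time.sleep(5)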


> a DNS misunderstanding doesn't have a particularly large "surface area" for diagnostic purposes, and it is supposed to be expeditiously resolvable by standard means (ordering clients to switch to a good DNS server and immediately use it to obtain good addresses)

Did you consider that DNS might’ve been a symptom? If the DynamoDB DNS records are driven by health checks, switching DNS servers will not resolve the issue and might make it worse by directing an unusually high volume of traffic at static IPs with no autoscaling or fault recovery behind them.


> It's the most plausible, fact-based guess, beating other competing theories.

"My wildly conjectural and self-serving theory is not only correct, it is the most correct".

Lol, perfectly represents the arrogance of HN.


The article describes evidence for a concrete, straightforward organizational decay pattern that can explain a large part of this miserable failure. What's "self-serving" about such a theory?

My personal "guess" is that failing to retain knowledge and talent is only one of many components of a well-rounded crisis of bad management and bad company culture that has been eroding Amazon on more fronts than AWS reliability.

What's your theory? Conspiracy within Amazon? Formidable hostile hackers? Epic bad luck? Something even more movie-plot-like? Do you care about making sense of events in general?


My theory is that someone fucked up. There’s literally no information yet that gives us any additional insight into what happened.


We watched someone repeatedly shoot themselves in the foot a few months ago. It is indeed a guess that this may be the cause of their current foot pain, but it is a rather safe one.


Pretty sure you don’t mean crossdressers!

CodeWeavers?


A little of column A, a little of column B ;) This was a fun day in the office: https://www.codeweavers.com/blog/jwhite/2011/1/18/all-dresse...


I’m completely paranoid about Claude messing with my .git folder, so I push regularly.


For the same reason, I run OpenCode under macOS's sandbox-exec command with some rules that prevent writes to the .git folder or outside of the project (but allow writes to the .cache and opencode directories).

sandbox-exec -p "(version 1)(allow default)(deny file-write* (subpath \"$HOME\"))(allow file-write* (subpath \"$PWD\") (subpath \"$HOME/.local/share/opencode\"))(deny file-write* (subpath \"$PWD/.git\"))(allow file-write* (subpath \"$HOME/.cache\"))" /opt/homebrew/bin/opencode


What sort of issues do you get debugging?

My experience of .NET, even from version 1, is that it has the best debugging experience of any modern language, from the Visual Studio debugger to using sos.dll to debug crash dumps.

