
Yep. I'm confident the large cloud providers (Google, Amazon, etc) run enough SSDs that it's worth them having this data internally, but there's probably no motivation to share it, or maybe they see having the data as some sort of competitive advantage.



FWIW, Josef Bacik, one of the main developers of btrfs and a long-time Facebook employee, said many times that most Facebook servers use the crappiest SSDs they can buy.

I won't delve into hour-long searches through the mailing lists to support a casual comment, but here's something I could dig out quickly:

https://lwn.net/Articles/824855/


> I won't be delving into hour-long searches through the mailing lists to support a casual comment

Don't worry fam, I got you.

https://lwn.net/ml/fedora-devel/03fbbb9a-7e74-fc49-c663-3272...

Source via: https://news.ycombinator.com/item?id=30443761


Not sure what the takeaway is from their experience.

Is it:

1. Consumer-grade/crappy SSDs are reliable enough at scale, and fail rarely enough, that it's easier to just manage the occasional failures?

2. The engineering effort to keep things together while crappy SSDs fail often costs less than expensive enterprise-grade hardware?

Or is it something else?


Once you build a distributed system that's large, it has to tolerate routine node failures.

Once you have a fault tolerant system, and some idea of failure rates, you can then ask "is it worth having X units of less reliable hardware or Y units of more reliable hardware? How many will still be running after a year?"

You then find out that buying a few more cheap nodes compensates completely for the lower reliability.
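
As a back-of-envelope illustration, here's that comparison in a few lines of Python. Every number below (failure rates, prices, budget) is made up for the sake of the sketch, not real fleet data:

    # Toy fleet comparison; AFRs and prices are assumed, not measured.
    cheap_afr, enterprise_afr = 0.02, 0.005   # assumed annual failure rates
    cheap_cost, enterprise_cost = 100, 250    # assumed cost per drive ($)
    budget = 100_000

    cheap_n = budget // cheap_cost            # 1000 drives
    enterprise_n = budget // enterprise_cost  # 400 drives

    print(f"cheap:      {cheap_n} bought, ~{cheap_n * (1 - cheap_afr):.0f} expected alive after a year")
    print(f"enterprise: {enterprise_n} bought, ~{enterprise_n * (1 - enterprise_afr):.0f} expected alive after a year")

With these assumed numbers the cheap fleet ends the year with roughly 980 working drives versus roughly 398 for the same budget: the extra units more than cover the worse reliability.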


One more step and it converges with Erlang's "let it crash" approach.


This thread seems overly complex, given that an engineer tasked with designing RAID storage to replace an unplanned haberdashery would probably start by looking up what the requested acronym stands for.


> use the crappiest SSDs they can buy

That might be a bad idea, from personal experience. I bought a cheap SSD from a reputable vendor (Crucial) to upgrade someone's old laptop. Sometimes (10-20% of the time?), the laptop boots about as slowly as (or worse than) with a mechanical HDD. The SSD is a DRAM-less model, which seems to cause some really bad performance cliffs.
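
For what it's worth, the cliff is easy to see with a crude probe. Here's a minimal Python sketch (assuming you can write a scratch file, "probe.bin" here, on the drive under test) that times small fsync'd writes and prints tail latencies, where DRAM-less drives tend to show ugly outliers:

    import os
    import time

    # Crude probe: time small synchronous writes. DRAM-less SSDs often
    # show rare but huge latency outliers under this kind of load.
    lat = []
    block = os.urandom(4096)
    with open("probe.bin", "wb", buffering=0) as f:
        for _ in range(2000):
            t0 = time.perf_counter()
            f.write(block)
            os.fsync(f.fileno())
            lat.append(time.perf_counter() - t0)
    os.remove("probe.bin")

    lat.sort()
    print(f"p50 = {lat[len(lat) // 2] * 1e3:.2f} ms")
    print(f"p99 = {lat[int(len(lat) * 0.99)] * 1e3:.2f} ms")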


This is because you're missing the key tenet: massive parallelism. This works when you have 1000 drives with significant data redundancy distributed among them. If two of them fail, and a third is so slow that it can also be considered failed, it's not a big deal: you still have 997 adequately performing drives, and the system can easily handle the loss of 0.3% of its capacity.

If you only have one drive, try to make it as good as you can afford.
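
To put a rough number on why the 1000-drive case is safe, here's a sketch with assumed parameters: 3-way replication, replicas placed on distinct drives chosen at random, and 3 drives dead at once:

    from math import comb

    # Assumed setup: 1000 drives, each block replicated on 3 distinct
    # random drives, 3 drives currently failed.
    n_drives, replicas, failed = 1000, 3, 3

    # A block is lost only if all of its replicas landed on failed drives.
    p_block_lost = comb(failed, replicas) / comb(n_drives, replicas)
    print(f"per-block loss probability: {p_block_lost:.1e}")  # ~6.0e-09

So even a triple failure almost never lines up with all copies of any one block; the real cost is just re-replicating the lost 0.3% of raw capacity before anything else dies.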


RAID expects non-deterministic bad behavior. A group of these drives receiving the same pattern of block accesses could exhibit the same failure at the same time, depending on how they were designed. Similarly, RAID is usually built around the expectation of clear, irreversible failures: some arrays perform at their worst on reads when a drive that isn't yet known to be bad responds slowly, and optimizing for that by always racing the parity reconstruction against the last straggling response would be inefficient unless the array were designed specifically for this defect.
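
A toy Monte Carlo makes the correlation point concrete. All parameters are assumed: a 6-drive group that survives up to 2 failures, plus a small chance of a shared event (same firmware, same access pattern) killing every drive at once:

    import random

    # Toy model: a 6-drive group tolerating 2 failures. rho is the assumed
    # probability of a correlated event taking out the whole group.
    def p_array_loss(p_drive, rho, trials=200_000):
        losses = 0
        for _ in range(trials):
            if random.random() < rho:
                failures = 6                  # correlated: all fail together
            else:
                failures = sum(random.random() < p_drive for _ in range(6))
            if failures > 2:
                losses += 1
        return losses / trials

    print("independent failures only: ", p_array_loss(p_drive=0.02, rho=0.0))
    print("with a tiny correlated term:", p_array_loss(p_drive=0.02, rho=0.001))

Even a 0.1% correlated term dominates the array-loss probability, which is why identical drives from one batch undermine the independence RAID assumes.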



