> I won't be delving into hour-long searches through the mailing lists to suppor...

devsda · 2025-02-19T14:40:02 1739976002

Not sure what the takeaway is from their experience.

Is it

1. Consumer-grade/crappy SSDs are still reliable enough at scale, and fail rarely enough, that it is easier to just manage the occasional failure?

2. The engineering effort to keep things running while crappy SSDs fail often costs less than expensive enterprise-grade hardware?

Or is it something else?

pjc50 · 2025-02-19T15:04:03 1739977443

Once you build a distributed system that's large, it has to tolerate routine node failures.

Once you have a fault tolerant system, and some idea of failure rates, you can then ask "is it worth having X units of less reliable hardware or Y units of more reliable hardware? How many will still be running after a year?"

You then find out that buying a few more cheap nodes compensates completely for the lower reliability.
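A rough sketch of that comparison, in Python. Every number here (failure rates, unit costs, the target of 1000 surviving nodes) is made up for illustration, and the model assumes independent failures with no replacement during the year, so treat it as back-of-the-envelope arithmetic rather than anyone's real data.

```python
import math


def expected_survivors(nodes: int, annual_failure_rate: float) -> float:
    """Expected number of nodes still running after one year,
    assuming independent failures and no replacements."""
    return nodes * (1.0 - annual_failure_rate)


def nodes_needed(target_survivors: int, annual_failure_rate: float) -> int:
    """Smallest fleet size whose expected survivors meet the target."""
    return math.ceil(target_survivors / (1.0 - annual_failure_rate))


# Hypothetical hardware classes: numbers are assumptions, not measurements.
HARDWARE = {
    "cheap": {"afr": 0.05, "unit_cost": 100},
    "enterprise": {"afr": 0.01, "unit_cost": 300},
}

TARGET = 1000  # nodes we want alive after a year (assumed)

for name, hw in HARDWARE.items():
    n = nodes_needed(TARGET, hw["afr"])
    total = n * hw["unit_cost"]
    print(f"{name}: buy {n} nodes, total cost {total}")
```

With these invented numbers the cheap fleet needs only about 4% more nodes and still costs roughly a third as much. The conclusion flips if the price gap is small or the failure-rate gap is large, which is why you need the failure-rate estimate first.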

nine_k · 2025-02-19T15:49:53 1739980193

One more step, and it converges with Erlang's "Let it crash" approach.

snailmailstare · 2025-02-19T17:41:57 1739986917

This thread seems overly complex given that an engineer tasked with designing RAID storage to replace an unplanned haberdashery would probably start by looking up what the requested acronym stands for.