Once you build a large distributed system, it has to tolerate routine node failures.
Once you have a fault-tolerant system and some idea of failure rates, you can ask: "Is it worth having X units of less reliable hardware or Y units of more reliable hardware? How many will still be running after a year?"
You then find out that buying a few more cheap nodes compensates completely for the lower reliability.
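To make that arithmetic concrete, here is a rough sketch of the comparison. It assumes nodes fail independently at a constant annual failure rate and are never replaced, which is a simplification. The function names and any rates you plug in are hypothetical, not taken from real hardware data.

```python
import math


def expected_survivors(nodes: int, annual_failure_rate: float, years: float = 1.0) -> float:
    """Expected number of nodes still running after `years`.

    Assumes independent failures at a constant annual rate and no replacement.
    """
    survival_probability = (1.0 - annual_failure_rate) ** years
    return nodes * survival_probability


def nodes_needed(target_survivors: float, annual_failure_rate: float, years: float = 1.0) -> int:
    """Smallest node count whose expected survivors meet `target_survivors`."""
    survival_probability = (1.0 - annual_failure_rate) ** years
    return math.ceil(target_survivors / survival_probability)
```

With made-up rates of 2% per year for the reliable hardware and 8% for the cheap hardware, 100 reliable nodes leave about 98 running after a year, and `nodes_needed(98, 0.08)` says 107 cheap nodes get you to the same place. Whether seven extra cheap nodes cost less than the price premium on a hundred reliable ones is then a plain cost comparison.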
This thread seems overly complex given that an engineer tasked with designing RAID storage to replace an unplanned haberdashery would probably start by looking up what the requested acronym stands for.
Don't worry fam, I got you.
https://lwn.net/ml/fedora-devel/03fbbb9a-7e74-fc49-c663-3272...
Source via: https://news.ycombinator.com/item?id=30443761