
> "... For example, if a system makes use of two independent components, each with an availability of 99.9%, the resulting system availability is >99.999% ... "

This does not seem correct.



The quote is missing its context: the components are independent and redundant. In that context the quote is correct.


What they probably mean is that if each component can fail independently with a probability of 0.001 (0.1%), then the probability of both of them failing at once is 0.001 * 0.001 = 0.000001 (0.0001%).

If the system depends on just one of the components working, then its availability is 1 - 0.000001 = 0.999999 (99.9999%).
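
To make that arithmetic concrete, here is a minimal Python sketch that generalizes it to any number of redundant components. The function name `parallel_availability` is just an illustrative choice, and the calculation assumes the components fail fully independently.

```python
def parallel_availability(*availabilities):
    """Availability of a redundant system that is up if at least one
    component is up, assuming fully independent failures."""
    p_all_fail = 1.0
    for a in availabilities:
        p_all_fail *= (1.0 - a)  # probability that this component is also down
    return 1.0 - p_all_fail

# Two components at 99.9% each, as in the quote.
result = parallel_availability(0.999, 0.999)
print(f"{result:.6f}")  # 0.999999
```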


This is correct; however, in reality completely independent components are very rare. Even things that seem independent and truly redundant, e.g. the jet engines of an airliner, become much more likely to fail once one of them has failed. Therefore this line of reasoning must be applied with extreme care.
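
A rough way to see how much correlation matters is to split each component's failure probability into a shared part and an independent part. The sketch below is loosely in the spirit of a beta-factor common-cause model; the split itself, the `beta` parameter, and the function name `p_both_fail` are assumptions made for illustration, not something measured on a real system.

```python
def p_both_fail(p, beta):
    """Probability that both of two redundant components are down.

    p    -- failure probability of each component (0.001 for 99.9% availability)
    beta -- assumed fraction of p that comes from a shared cause (hypothetical knob)
    """
    p_common = beta * p         # a shared event takes out both components at once
    p_indep = (1.0 - beta) * p  # the remaining failures are treated as independent
    return p_common + (1.0 - p_common) * p_indep ** 2

for beta in (0.0, 0.01, 0.1):
    availability = 1.0 - p_both_fail(0.001, beta)
    print(beta, availability)
```

With `beta = 0` this reduces to the independent case (0.999999). If even 10% of failures share a cause, the chance of losing both is about 0.0001, so the system availability drops to roughly 99.99%.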


Indeed, contaminated fuel would do it.


I would not model the risk on fuel as part of the risk on jet engines.


Other correlated risks include weather, birds, thrust, age, time since last maintenance...

In fact, I'm having trouble coming up with any common cause of engine failure which isn't correlated.


They do put extensive precautions in place for over-water dual-engine flights: https://en.wikipedia.org/wiki/ETOPS

“Avoidance of multiple similar systems maintenance. Maintenance practices for the multiple similar systems requirement were designed to eliminate the possibility of introducing problems into both systems of a dual installation (e.g., engines and fuel systems) that could ultimately result in failure of both systems. The basic philosophy is that two similar systems should not be maintained or repaired during the same maintenance visit. Some operators may find this difficult to implement because all maintenance must be done at their home base.”

http://www.boeing.com/commercial/aeromagazine/aero_07/etops....


A manufacturing defect, either material or user error, maybe?


Many manufacturing defects affect whole batches of units.

People serious about preserving data with redundant arrays tend to be careful to avoid using multiple drives from the same batch.

I vaguely recall a cloud backup provider losing customer data because they failed to do this. Annoyingly, I can't find it on Google.


Even in this case, after one engine fails, the other engine(s) are set to a higher thrust, which increases the likelihood of their failure.


Can someone explain or argue that a system with such a high degree of availability, say >=99.9%, has properties beyond its purely statistical nature? If not, the resulting availability should be 0.999 * 0.999 = 0.998001 (99.8001%).


I suppose it depends on whether the components are used in series or in parallel.
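
The two readings give very different numbers, which a short Python sketch can show side by side. The helper names `series_availability` and `parallel_availability` are made up for this example, and both assume independent failures.

```python
import math

def series_availability(*availabilities):
    """System needs every component up: multiply the availabilities."""
    return math.prod(availabilities)

def parallel_availability(*availabilities):
    """System needs at least one component up: 1 minus the chance all are down."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)

series_result = series_availability(0.999, 0.999)
parallel_result = parallel_availability(0.999, 0.999)

print(f"series:   {series_result:.6f}")    # series:   0.998001
print(f"parallel: {parallel_result:.6f}")  # parallel: 0.999999
```

So 99.8001% is the answer when both components are required, and 99.9999% is the answer when either one is enough.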



