
I was unpleasantly surprised by eclecticlight.co’s findings about PAR2, but thankful to have found them. When I learned about PAR2 I immediately wanted to make par files for everything, because bit rot scares me. But, from https://eclecticlight.co/2020/04/20/file-integrity-5-how-wel... :

> This has serious implications for the use of Par2 with files much larger than 20 MB, and probably rules it out as a method of ECC for those larger than 1 GB.

I had assumed that a 10% PAR file size meant resistance to 10% of the input file being corrupted, but that’s not how it works. The article shows nonlinear and non-obvious relationships between input file size, PAR file size, and the maximum number of recoverable errors.
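A quick way to see this for yourself is a scaled-down rerun of that experiment. A sketch in Python, assuming the par2cmdline CLI is on PATH (file names and sizes here are made up for illustration):

    import os, random, subprocess

    SIZE = 20 * 1024 * 1024  # 20 MB, around where the article sees trouble

    with open("sample.bin", "wb") as f:
        f.write(os.urandom(SIZE))

    # 10% redundancy: -r sets the level of redundancy as a percentage.
    subprocess.run(["par2", "create", "-r10", "sample.par2", "sample.bin"],
                   check=True)

    # Flip 1% of the bytes at random offsets: scattered damage touches
    # many PAR2 blocks, the worst case for block-level recovery.
    with open("sample.bin", "r+b") as f:
        for _ in range(SIZE // 100):
            f.seek(random.randrange(SIZE))
            f.write(os.urandom(1))

    result = subprocess.run(["par2", "repair", "sample.par2"])
    print("repaired" if result.returncode == 0 else "unrecoverable")

Even though only 1% of bytes are corrupted, the damage can land in far more than 10% of the blocks, and PAR2's accounting is per block, not per byte. That is where the nonlinearity comes from.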



Bitrot is reasonably handled by erasure codes, simply by having CRC32 checksums (or similar) verify the parts.

If a piece has bitrotted, you throw away the whole segment and treat it as a known erasure.
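As a minimal sketch of that bookkeeping (Python; the chunk size and helper names are my own, not from any particular tool):

    import zlib

    # Every chunk carries its own CRC32; a chunk whose checksum no
    # longer matches is discarded whole and handed to the outer
    # erasure code as a known erasure.
    CHUNK = 4096  # illustrative chunk size

    def store(data: bytes):
        chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
        return [(zlib.crc32(c), c) for c in chunks]

    def load(stored):
        # None marks an erasure: position known, contents untrusted.
        return [c if zlib.crc32(c) == crc else None for crc, c in stored]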

CRC32 is closely related to Reed-Solomon / Galois fields: it's basically repeated division with remainders in a Galois field. And as we all know, division is very good at mixing up bits (true in normal math as well as in Galois fields).
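You can see the division directly in a bit-at-a-time implementation (a sketch; the constants are the standard reflected CRC-32 that zlib uses):

    import zlib

    def crc32_bitwise(data: bytes) -> int:
        # Long division of the message polynomial by the generator
        # polynomial over GF(2); 0xEDB88320 is the reflected form of the
        # standard CRC-32 polynomial 0x04C11DB7. Each step either shifts
        # (quotient bit 0) or shifts and subtracts, i.e. XORs, the
        # divisor (quotient bit 1); the final state is the remainder.
        crc = 0xFFFFFFFF
        for byte in data:
            crc ^= byte
            for _ in range(8):
                crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
        return crc ^ 0xFFFFFFFF

    assert crc32_bitwise(b"hello, world") == zlib.crc32(b"hello, world")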

The real benefit of cyclic codes is the guarantee of catching any burst error 32 bits or shorter (for CRC32). You only risk a false negative when the error region is wider than the CRC itself.
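A quick empirical check of the guarantee (a sketch; it only exercises the all-ones family of bursts, one instance of the general case):

    import os, zlib

    msg = os.urandom(64)
    base = zlib.crc32(msg)
    as_int = int.from_bytes(msg, "big")
    nbits = len(msg) * 8

    # Flip every contiguous run of 1..32 bits and confirm that none of
    # these bursts leaves the CRC32 unchanged.
    collisions = 0
    for length in range(1, 33):
        mask = (1 << length) - 1
        for pos in range(nbits - length + 1):
            corrupted = (as_int ^ (mask << pos)).to_bytes(len(msg), "big")
            if zlib.crc32(corrupted) == base:
                collisions += 1
    print(collisions)  # 0: every such burst is detected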

------

Indeed: the whole erasure code / error correction code field is built on complex math constructs precisely so that these tight guarantees can be made (be it CRC32, Reed-Solomon, or any old-school bit-error algorithm).
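For Reed-Solomon the guarantee is just as crisp: n parity symbols correct up to n/2 symbol errors at unknown positions. A sketch using the third-party reedsolo package (an assumption on my part; any RS library with encode/decode would do):

    from reedsolo import RSCodec

    rs = RSCodec(10)  # 10 parity bytes: corrects up to 5 byte errors
    encoded = bytearray(rs.encode(b"tight guarantees, not best effort"))
    encoded[0] ^= 0xFF   # two byte errors at unknown positions
    encoded[7] ^= 0xFF
    # Recent reedsolo versions return (message, message+ecc, errata
    # positions); older ones return just the message.
    decoded = rs.decode(bytes(encoded))[0]
    assert decoded == b"tight guarantees, not best effort"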


> because bit rot scares me.

Use a WinRAR/RAR recovery record for the important things.

There is one site that still mandates a 5% recovery record for its archives, because before HTTPS became ubiquitous, archives trashed in transit were the norm.
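A sketch of what that looks like in practice (Python driving the rar CLI; the archive and folder names are made up, and I'm going from my reading of the rar docs on the -rr switch):

    import subprocess

    # Add a 5% recovery record while archiving ('p' suffix = percent;
    # the switch also accepts '%'). Assumes the rar CLI is installed.
    subprocess.run(["rar", "a", "-rr5p", "important.rar", "important/"],
                   check=True)

    # If the archive later gets trashed in transit, the 'r' (repair)
    # command rebuilds it from the recovery record.
    subprocess.run(["rar", "r", "important.rar"], check=True)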


It's my understanding that par2 is designed for missing files (parts of a multi-part archive), not the uniform random bit-rot corruption used in that article. I think it can recover a much larger corrupted or missing block, approaching the size of the parity files (sketch below).

But yeah, if that's your data-loss model, then par2 isn't the right approach. (Not sure what is.)
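That distinction between known-position erasures and unknown-position errors is easy to demonstrate with any Reed-Solomon library; a sketch with the third-party reedsolo package:

    from reedsolo import RSCodec

    # With n parity symbols you can recover n *known* erasures, but only
    # n/2 errors at unknown positions. A missing archive part is the
    # easy, known-position case par2 was built for.
    rs = RSCodec(10)
    encoded = bytearray(rs.encode(b"multi-part archive payload"))
    lost = list(range(10))       # ten symbols "missing", positions known
    for i in lost:
        encoded[i] = 0
    decoded = rs.decode(bytes(encoded), erase_pos=lost)[0]
    assert decoded == b"multi-part archive payload"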


I think something about the test methodology in that article is severely flawed.



