Battle testing ZFS, Btrfs and mdadm+dm-integrity (2020) (unixdigest.com)
20 points by psxuaw on March 27, 2024 | hide | past | favorite | 8 comments



Not a fan of the coverage of ECC in the misunderstandings section.

>In any way you can run ZFS and Btrfs without using ECC memory, it's not a requirement.

Should be:

>You can run ZFS and Btrfs without using ECC memory, it is definitely possible, but you shouldn't run ANY filesystem without ECC memory.

It cannot be stressed enough. Friends don't let friends run computers without ECC memory.


Almost all small form factor NAS I am aware of which are affordable are non-ECC.

Almost all homebrew kits ship boards which are non-ECC.

You have to explicitly configure ECC as an option, eg when ordering a NUC.

If you want to make this true, you've got to confront the problem that people go to generic, cheap SBC platforms to run their home NAS, and they are almost never ECC.

My work ZFS is ECC.


>Almost all small form factor NAS I am aware of which are affordable are non-ECC.

>Almost all homebrew kits ship boards which are non-ECC.

>You have to explicitly configure ECC as an option, eg when ordering a NUC.

Wow. Placing such products in the market without any prominent warnings (or worse, making promises of reliability) should be considered criminal negligence.


Well. Need to qualify that. They may have things like on-die ECC, but they aren't exposing its state. So it's down to trust.

https://forums.raspberrypi.com/viewtopic.php?t=358921 for instance
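For what it's worth, you can check whether ECC is present and actually reported on a Linux box. A rough sketch (needs root; field names and sysfs paths vary by platform, so treat this as a starting point, not a guarantee):

```shell
# SMBIOS: "Error Correction Type" (type 16) and DIMM widths (type 17);
# a 72-bit total width over a 64-bit data width indicates ECC DIMMs.
dmidecode -t memory | grep -E 'Error Correction Type|Total Width|Data Width'

# Linux EDAC: if a memory-controller driver is loaded, corrected error
# counts show up here -- absent on boards that don't wire ECC through.
grep -H . /sys/devices/system/edac/mc/mc*/ce_count 2>/dev/null
```

If dmidecode claims ECC but no EDAC counters ever appear, you're back to the trust problem: correction may be happening on-die with no visible state.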


Anecdotal, but ever since running my workstation with >64 GB of RAM, I'd had stretches of unexplained behavior. Since switching to ECC three years ago, these random errors no longer surface. It's a 24/7 machine, rebooted only for patching.

I thought it was just a luxury feel good investment, but the longer I run without errors, the more real the problem seems.

The larger the memory capacity, the higher the probability of bit flips, so that likely plays a role.
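A back-of-the-envelope sketch of that scaling. The per-bit upset rate below is an assumed illustrative figure (real measured rates vary by orders of magnitude across studies and hardware); the point is only that expected flips grow linearly with capacity:

```shell
rate=1e-15          # ASSUMED flips per bit per hour, for illustration only
hours=8760          # one year of 24/7 uptime

flips() {  # usage: flips <capacity_in_GiB>
  awk -v gib="$1" -v r="$rate" -v h="$hours" \
    'BEGIN { printf "%.2f\n", gib * 2^30 * 8 * r * h }'
}

flips 16    # ~1.20 expected flips per year at the assumed rate
flips 128   # ~9.63 -- eight times as many, linear in capacity
```

So a machine with 8x the RAM sees 8x the expected flips for the same per-bit rate, which fits the anecdote above.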


With btrfs you can do raid5 data, but use raid1 for metadata. This actually works.
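A sketch of how that split is set up (assumes btrfs-progs and three spare disks; the device names and mount point are placeholders, and mkfs is destructive, so double-check before running):

```shell
# New filesystem: raid5 for data, raid1 for metadata.
mkfs.btrfs -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd

# An existing filesystem can be converted online with balance filters:
btrfs balance start -dconvert=raid5 -mconvert=raid1 /mnt/pool

# Verify which profiles are actually in use:
btrfs filesystem df /mnt/pool
```

The rationale is that the well-known raid5/6 write-hole issues are far more dangerous for metadata than for data, so keeping metadata on raid1 sidesteps the worst failure mode.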

This post is testing a known bad thing, that comes with warnings, and confirming that yes the well known problems are real. Waste of time.


> With Btrfs many people immediately use the btrfs check --repair command when they experience an issue, but this is actually the very last command you want to run.

Ok. So what's the recommended approach? Dump the metadata and start hand culling from there?

If the tool doesn't exist, it's just not ready yet.
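To answer my own question, the usual advice in the btrfs docs is read-only recovery first, repair last. Roughly (device paths are placeholders; image the disk before any of this):

```shell
# 1. Try a read-only rescue mount and copy data off (rescue=all
#    needs kernel 5.9+; older kernels have usebackuproot):
mount -o ro,rescue=all /dev/sdb /mnt/recovery

# 2. If it won't mount at all, pull files without mounting
#    (-D is a dry run listing what would be restored):
btrfs restore -D /dev/sdb /tmp/restored
btrfs restore /dev/sdb /tmp/restored

# 3. Read-only consistency check -- safe, unlike --repair:
btrfs check --readonly /dev/sdb
```

Only once data is off the disk does `btrfs check --repair` become a reasonable gamble.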


As highlighted in the article itself (final notes --> update), the recommended approach is to not use btrfs at all.



