
Base VM images would be a rare and specific workload - one of the few cases where dedupe makes sense. However, if you are hosting VMs off a ZFS filesystem, you are more likely to be using better strategies like block or filesystem cloning. Not doing so would be throwing away one of its primary differentiators as a filesystem in such an environment.
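
As a rough sketch of what that cloning workflow looks like (pool and dataset names here are made up for illustration): you snapshot the golden image once, then clone per guest, and the clones share blocks with the snapshot until they diverge.

    # snapshot the base image once
    zfs snapshot tank/vm/base@golden

    # clones are near-instant and consume no extra space up front;
    # each guest only stores the blocks it later changes
    zfs clone tank/vm/base@golden tank/vm/guest01
    zfs clone tank/vm/base@golden tank/vm/guest02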

General purpose fileserving or personal desktop/laptop use generally has very few duplicated blocks and is not worth the overhead. Backups are hit or miss, depending both on how the backups are implemented and on whether they are encrypted before they reach the filesystem.

Compression is a totally different thing, and current ZFS best practice is to enable it by default for pretty much every workload - the CPU cost is barely worth mentioning these days, and the I/O savings can be considerable even before counting any storage space savings. Log storage will likely see much better than 6:1 savings with typical logging, at least in my experience.
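
For reference, enabling it is a one-liner (the dataset names are hypothetical; lz4 is the usual low-cost default, while zstd on recent OpenZFS releases trades a little more CPU for a better ratio):

    # compression applies to newly written blocks and is
    # inherited by child datasets
    zfs set compression=lz4 tank/data
    zfs set compression=zstd tank/logs

    # check the ratio you're actually getting
    zfs get compressratio tank/logs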



> General purpose fileserving or personal desktop/laptop use generally has very few duplicated blocks and is not worth the overhead.

I would contend this is because we don't have good transparent deduplication right now - just some bad compromises. Hard links? Edit anything and it gets edited everywhere - not what you want. Symlinks? They look different enough that programs treat them differently.

I would argue your regular desktop user actually has an enormous demand for a good deduplicating file system - there's no end of use cases where the first step is "make a separate copy of all the relevant files just in case" and a lot of the time we don't do it because it's just too slow and wasteful of disk space.

If you're working with, say, large video files, then a good dedupe system would make copies basically instant, and would have a decent enough block-splitting algorithm that edits/cuts/etc. of the kind people do losslessly or with editing programs are stored efficiently without special effort. How many people are producing video content today? Thanks to TikTok we've dragged that skill right down to "random teenagers", who might hopefully pipeline into working with larger content.


But according to the article, the regular desktop already has such a dedup system:

> If you put all this together, you end up in a place where so long as the client program (like /bin/cp) can issue the right copy offload call, and all the layers in between can translate it (eg the Windows application does FSCTL_SRV_COPYCHUNK, which Samba converts to copy_file_range() and ships down to OpenZFS). And again, because there’s that clear and unambiguous signal that the data already exists and also it’s right there, OpenZFS can just bump the refcount in the BRT.
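
On Linux with OpenZFS 2.2+ block cloning available (depending on the release, it may need to be enabled explicitly), you can exercise this path from the shell. A minimal sketch, with made-up file and pool names:

    # ask cp to clone rather than copy; GNU cp issues the
    # FICLONE/copy_file_range offload, which OpenZFS satisfies
    # by bumping the refcount in the BRT instead of copying data
    cp --reflink=always big-video.mov big-video-copy.mov

    # pool-wide block cloning stats (OpenZFS 2.2+)
    zpool get bcloneused,bclonesaved,bcloneratio tank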



