> I don't know if the authors are here, but if they are - would you comment on fragmentation and the dangers of growing a filesystem past 95-98% full ?
Fragmentation isn't an issue in TFS at all, because it is a cluster-based file system. Essentially, that means files aren't stored contiguously, but in small chunks. Allocation is done entirely on the basis of unrolled freelists.
This does cause a slight space overhead (only slight, since it comes from the file's metadata being stored in full form), but it completely eliminates any fragmentation.
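To make the allocation scheme a bit more concrete, here's a rough in-memory sketch in Rust of how an unrolled freelist allocator can work. The names, node capacity, and layout here are illustrative assumptions, not TFS's actual on-disk format:

```rust
// Sketch of an unrolled freelist for cluster allocation (illustrative only).
// Each node is itself a free cluster and carries the addresses of many more
// free clusters, so most allocations and frees only touch the head node.

const ENTRIES_PER_NODE: usize = 62; // assumed capacity; depends on cluster size

struct Node {
    addr: u64,               // cluster address of the node itself
    free: Vec<u64>,          // other free clusters (at most ENTRIES_PER_NODE)
    next: Option<Box<Node>>, // next node in the freelist
}

struct Freelist {
    head: Option<Box<Node>>,
}

impl Freelist {
    /// Pop a free cluster. Usually this just shrinks the head node's array.
    fn allocate(&mut self) -> Option<u64> {
        let head = self.head.as_mut()?;
        if let Some(cluster) = head.free.pop() {
            return Some(cluster);
        }
        // Head node is exhausted: hand out its own cluster and advance.
        let exhausted = self.head.take()?;
        self.head = exhausted.next;
        Some(exhausted.addr)
    }

    /// Return a cluster. If the head node is full (or missing), the freed
    /// cluster itself becomes a new head node.
    fn free(&mut self, cluster: u64) {
        if let Some(head) = self.head.as_mut() {
            if head.free.len() < ENTRIES_PER_NODE {
                head.free.push(cluster);
                return;
            }
        }
        let next = self.head.take();
        self.head = Some(Box::new(Node { addr: cluster, free: Vec::new(), next }));
    }
}
```

Because free clusters are batched into a handful of nodes, allocating and freeing are O(1) pops and pushes on the head node, regardless of where the clusters physically sit on disk.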
I only have a basic understanding of harddisks/filesystems, but won't that slow down reading/writing on harddisks since the chunks won't be in order and close together?
> A good design would look at the state of the art and use the best techniques available. If the aim was research, then try one new thing, not a thousand.
That's what it does: It takes from many sources (although mainly ZFS).
TFS was created to speed up development. The issue is that following the design specs makes the implementation much slower and prevents "natural", incremental development (you cannot build it like a tower; you need every component before anything is complete). It was started[1] and got far enough to read images, but implementing it took ages, so we decided to put it off for now.
This doesn't quite seem to follow? ZFS's pool model has supported feature flags for a very long time - isn't the issue more that, to do the things ZFS does, you need to implement all the other components? And since you're planning to do a lot of what ZFS does...
The first section in the README is called "Design goals", with 13 items. None of them is "data integrity", and none of them even talks about validating the data or handling any failures aside from power loss.
By contrast, in the canonical slide deck on ZFS[1], the first slide talks about "provable end-to-end data integrity". In the paper[2], "design principles" section 2.6 is "error detection and correction".
I'm glad to hear that's also a focus for TFS. With ZFS, the emphasis on data integrity resulted in significant architectural choices -- I'm not sure it's something that can just be bolted on later. As a reader, I wouldn't have assumed TFS had the same emphasis. I think it's pretty valuable to spell this out early and clearly, with details, because it's actually quite a differentiator compared with most other systems.
> The first section in the README is called "Design goals", with 13 items. None of them is "data integrity", and none of them even talks about validating the data or handling any failures aside from power loss.
Firstly, it's ChaCha20. I don't think anybody in their right mind would advocate a 2-round ChaCha. Secondly, there _are_ stream cipher constructions that achieve the design requirements for something like TFS.
On the contrary, it makes it suitable for file systems! File systems are not block devices, they are data structures on top of block devices — it's the job of these data structures to keep stuff, such as data, inodes, and... keys, and IVs, and MACs, checksums, etc.
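As a rough sketch of what that buys you (field names and sizes are my own illustration, not TFS's or bcachefs's actual metadata format), the filesystem can keep the cryptographic bookkeeping right in the pointer to each extent:

```rust
/// Hypothetical on-disk reference to one encrypted, checksummed extent.
/// Because the filesystem owns this layout, the nonce and MAC live right
/// next to the address, which sector-level schemes like XTS have no room for.
#[repr(C)]
struct ExtentPointer {
    addr: u64,       // physical cluster address of the extent
    len: u32,        // extent length in clusters
    nonce: [u8; 12], // per-extent nonce, never reused under the same key
    mac: [u8; 16],   // Poly1305 tag authenticating the ciphertext
    checksum: u64,   // checksum of the plaintext, e.g. for scrubbing
}
```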
"If you’re encrypting a filesystem and not disk blocks, still don’t use XTS! Filesystems have format-awareness and flexibility. Filesystems can do a much better job of encrypting a disk than simulated hardware encryption can."
Edit 2: also check out bcachefs encryption design doc: http://bcachefs.org/Encryption/ (also not perfect, but uses proper AEAD — ChaCha20-Poly1305. I sent some questions and suggestions to the author, but received no reply :/)
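For reference, encrypting a cluster with ChaCha20-Poly1305 via the `chacha20poly1305` crate (0.10-series API) looks roughly like this; the key, nonce, and data are placeholders:

```rust
use chacha20poly1305::aead::{Aead, KeyInit};
use chacha20poly1305::{ChaCha20Poly1305, Key, Nonce};

fn main() {
    // Placeholder key/nonce; a real filesystem derives the key from the
    // passphrase and stores a fresh nonce per extent in its metadata.
    let key = Key::from_slice(&[0x42; 32]);
    let cipher = ChaCha20Poly1305::new(key);
    let nonce = Nonce::from_slice(&[0x24; 12]); // must never repeat under one key

    // encrypt() appends the 16-byte Poly1305 tag to the ciphertext, so a
    // flipped bit anywhere makes decrypt() fail instead of returning garbage.
    let ciphertext = cipher.encrypt(nonce, b"cluster contents".as_ref()).unwrap();
    let plaintext = cipher.decrypt(nonce, ciphertext.as_ref()).unwrap();
    assert_eq!(plaintext, b"cluster contents".to_vec());
}
```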
SHA256 is very slow, and that's no surprise. It's cryptographic after all.
Here's a small list of use cases for non-cryptographic hash functions:
- Checksums and error correction codes, as long as there is no way to maliciously use this.
- Hash tables. These always use non-cryptographic hash functions.
- Bloom filters.
- Heuristic fingerprinting. They're not strong enough to be used as real data fingerprints, but they can be used to decide whether two buffers are "probably equal" or "certainly not equal" (see the sketch after this list).
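A tiny sketch of that last point, using the `seahash` crate's one-shot `seahash::hash` function: matching hashes only suggest equality, but differing hashes prove inequality.

```rust
// Heuristic fingerprinting with a non-cryptographic hash: cheap to compute,
// but an attacker (or plain bad luck) can produce collisions, so equal
// hashes are only a hint, while unequal hashes are a proof of difference.
fn probably_equal(a: &[u8], b: &[u8]) -> bool {
    seahash::hash(a) == seahash::hash(b)
}

fn main() {
    let x = vec![1u8; 4096];
    let y = vec![2u8; 4096];
    assert!(probably_equal(&x, &x.clone())); // same bytes, same hash
    assert!(!probably_equal(&x, &y));        // different hash => certainly not equal
}
```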
Hash tables are the main one. Cryptographic hash functions are almost never used in them. SipHash is a popular choice, but it is not a cryptographic hash function. That is a common misunderstanding: it's a MAC.
There's a huge difference between cryptographic and non-cryptographic hash functions. Note that BLAKE2 has variable output lengths, whereas SeaHash is fixed to 64 bits (although I suppose it's not too hard to make a version with a bigger output), so collisions will naturally happen.
If you need fingerprints, don't use SeaHash, but if you are looking to insert into e.g. a hash table, you shouldn't use BLAKE2. It's awfully slow for that.
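For the hash-table case, here's a minimal sketch of swapping Rust's default SipHash-based hasher for SeaHash, assuming the `seahash` crate's `SeaHasher` (which implements `Hasher` and `Default`) together with `BuildHasherDefault`; the map contents are just an example:

```rust
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

use seahash::SeaHasher;

// A HashMap that hashes keys with SeaHash instead of the default SipHash.
// Fine for trusted keys; with attacker-controlled keys, a keyed hash like
// SipHash is the safer default because it resists collision flooding.
type SeaHashMap<K, V> = HashMap<K, V, BuildHasherDefault<SeaHasher>>;

fn main() {
    let mut clusters: SeaHashMap<u64, &str> = SeaHashMap::default();
    clusters.insert(42, "metadata");
    clusters.insert(1337, "data");
    assert_eq!(clusters.get(&42), Some(&"metadata"));
}
```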