While this article wants to establish additional layers above the filesystem, I have always wondered how comparable modern filesystems are to key-value datastores.
As far as I can see, they seem to be comparable to B+tree-indexed key-value stores. A key would be, e.g., "/home/user/test.txt".
Thanks to the B+tree index you can do a prefix scan to list folders (e.g. "ls /home/user/" --> all keys starting with "/home/user/").
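To make the prefix-scan idea concrete, here is a minimal sketch in Python. It assumes the keys are full paths kept in sorted order, with a sorted list and `bisect` standing in for the B+tree; the `prefix_scan` helper and the sample paths are made up for illustration. Note that unlike `ls`, this returns everything below the prefix, including nested entries.

```python
import bisect

def prefix_scan(sorted_keys, prefix):
    """Return every key in sorted_keys that starts with prefix."""
    # Binary search for the first key >= prefix (the "seek" step of an index scan).
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = lo
    # Walk forward while keys still share the prefix (the "range scan" step).
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]

# Hypothetical keys; a sorted list stands in for the B+tree leaf order.
keys = sorted([
    "/home/other/notes.txt",
    "/home/user/docs/a.txt",
    "/home/user/test.txt",
    "/var/log/syslog",
])
listing = prefix_scan(keys, "/home/user/")
```

Because the keys are sorted, all matches sit next to each other, which is what makes the scan cheap in a tree-ordered index.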
Some of them, e.g. ReiserFS, actually use B+trees.
They have a caching layer managed by the OS. Most of them have journaling, which would be the equivalent of a write-ahead log.
Map-reduce-based "view" generation can easily be done with pipes and utilities like grep. We might even be able to do some sort of simplistic filtering/views/relations using symlinks.
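As a rough illustration of the "view" idea, here is a sketch in Python rather than a shell pipeline. The map step greps each file for a substring and the reduce step sums the hits per directory. The `build_view` function, the directory layout, and the file contents are all invented for the example.

```python
import os
import tempfile
from collections import Counter

def build_view(root, needle):
    """Count lines containing needle, grouped by directory relative to root."""
    counts = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Map: count matching lines in one file (roughly `grep -c`).
            with open(path, encoding="utf-8", errors="replace") as fh:
                hits = sum(1 for line in fh if needle in line)
            # Reduce: accumulate per-directory totals.
            if hits:
                counts[os.path.relpath(dirpath, root)] += hits
    return dict(counts)

# Throwaway sample tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "b"))
    with open(os.path.join(root, "a", "1.txt"), "w", encoding="utf-8") as fh:
        fh.write("error: disk\nok\nerror: net\n")
    with open(os.path.join(root, "b", "2.txt"), "w", encoding="utf-8") as fh:
        fh.write("ok\n")
    view = build_view(root, "error")
```

Unlike a real database view, nothing here is kept up to date incrementally; the whole tree is rescanned on every call.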
I guess the main difference is that they aren't optimized for this database-like behavior from a performance standpoint and that the network interfaces to them are SMB/AFP/NFS.
Facebook was using the filesystem for storing photos and then moved to Haystack, which is essentially an append-only log similar to Bitcask. The problem with using the filesystem as a KV store is that for every item stored, you're also storing a whole lot of filesystem-specific metadata: created, updated, permissions, etc.
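For a feel of what "append-only log" means here, below is a minimal Python sketch of the general idea: one data file, records appended at the end, and an in-memory index from key to offset. This is not Haystack's or Bitcask's actual on-disk format; the `AppendOnlyStore` class, the record layout, and the keys are assumptions made for illustration, and it leaves out checksums, compaction, and rebuilding the index on restart.

```python
import os
import struct
import tempfile

class AppendOnlyStore:
    """Toy log-structured KV store: one file, an in-memory key -> offset index."""

    HEADER = struct.Struct(">II")  # key length, value length (assumed layout)

    def __init__(self, path):
        self.index = {}              # key -> (value offset, value size)
        self.fh = open(path, "a+b")  # append mode: writes always go to the end

    def put(self, key, value):
        k = key.encode("utf-8")
        self.fh.seek(0, os.SEEK_END)
        offset = self.fh.tell()
        self.fh.write(self.HEADER.pack(len(k), len(value)) + k + value)
        self.fh.flush()
        # The newest record wins; older records for the key stay in the log.
        self.index[key] = (offset + self.HEADER.size + len(k), len(value))

    def get(self, key):
        offset, size = self.index[key]
        self.fh.seek(offset)
        return self.fh.read(size)

    def close(self):
        self.fh.close()

with tempfile.TemporaryDirectory() as d:
    store = AppendOnlyStore(os.path.join(d, "photos.log"))
    store.put("photo:1", b"OLD")
    store.put("photo:2", b"JPEGDATA")
    store.put("photo:1", b"NEWER")   # overwrite by appending a new record
    first = store.get("photo:1")
    second = store.get("photo:2")
    store.close()
```

The per-item overhead is just the small header plus an index entry, instead of a full set of filesystem metadata for every stored object.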