
Great stuff! Thanks for publishing it. I'm looking forward to future posts.

I'm curious about how you're managing the data on a drive itself. Are you storing the blocks as individual files on a filesystem? Are you doing direct management of the block device itself? Something else?




Yup, as @jamwt says we directly manage the block device.

That said, up until somewhat recently our data really was written into 1GB extents stored as files on an XFS filesystem. This mostly worked fine, but we switched away from it so that we could directly manage disk layout for SMR storage (shingled magnetic recording, a newer type of hard drive with poor random-write performance). We also got some performance improvements, increases in storage density, and some minor reliability improvements from avoiding the filesystem.

It takes a lot more operational tooling to directly manage a block device since you obviously can't use all the standard filesystem tools.


We're going to go into detail on that (the OSD component) in a future blog post... but short version, yep, it's a custom "filesystem" directly done on the block device.


That's very interesting. When you post about that, could you go into why you chose it over, say, large files (32 GB or so) that each contain a collection of blocks? XFS or ext4 can preallocate large chunks of a file beforehand, so the data is laid out nicely on disk. A fully custom block-level layout sounds like a nice way to minimize overhead. I guess it goes to show that extreme scale makes certain decisions that seem crazy from a support point of view start to make sense.
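
For anyone unfamiliar with the preallocation approach mentioned above, here is a minimal sketch in Python of reserving space for a large extent file up front and then writing blocks into it at fixed offsets. It only illustrates the general technique and is not how the system discussed in this thread works; the function names `preallocate_extent` and `write_block` and the sizes used are made up for the example. It assumes a Unix-like system where `os.posix_fallocate` is available.

```python
import os


def preallocate_extent(path: str, size_bytes: int) -> int:
    """Create (or open) a file and reserve size_bytes of disk space for it.

    Hypothetical helper for illustration. Returns the file size after
    preallocation.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Ask the filesystem to allocate the whole range now, so later
        # writes do not need to allocate space piecemeal.
        os.posix_fallocate(fd, 0, size_bytes)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def write_block(path: str, offset: int, data: bytes) -> int:
    """Write one block at a fixed offset inside a preallocated extent file.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite writes at the given offset without moving a file position.
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)
```

Whether the preallocated space actually ends up contiguous on disk depends on the filesystem and how fragmented its free space is; the call only guarantees that the space is reserved.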



