Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> It’s really the hardware block size that matters in this case (direct I/O). That value is a property of the hardware and can’t be changed.

NVMe drives have at least three "hardware block sizes". There's the LBA size that determines what size IO transfers the OS must exchange with the drive, and that can be re-configured on some drives, usually 512B and 4kB are the options. There's the underlying page size of the NAND flash, which is more or less the granularity of individual read and write operations, and is usually something like 16kB or more. There's the underlying erase block size of the NAND flash that comes into play when overwriting data or doing wear leveling, and is usually several MB. There's the granularity of the SSD controller's Flash Translation Layer, which determines the smallest size write the SSD can handle without doing a read-modify-write cycle, usually 4kB regardless of the LBA format selected, but on some special-purpose drives can be 32kB or more.

And then there's an assortment of hints the drive can provide to the OS about preferred granularity and alignment for best performance, or requirements for atomic operations. These values will generally be a consequence of the the above values, and possibly also influenced by the stripe and parity choices the SSD vendor made.



I've run into (specialized) flash hardware with 512 kB for that 3rd size.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: