Turns out sometimes people other than Linus have more experience with IO than Linus.
I think there are pretty good reasons to go for DIO for a database. But only when there's a good sysadmin/DBA and when the system is dedicated to the database. There are considerable performance gains in going for DIO (at the cost of significant software complexity), but it's much more sensitive to bad tuning and isn't at all adaptive to overall system demands.
Yes, more than anything I'm amused by the fact that when you do the Linus-approved thing in 2007 it leaves you in this terrible situation in 2019, and when the other kernel experts put their heads together their solution is to abandon the gentle advice from 10 years earlier.
Yeah, which is one of the reasons PostgreSQL went with buffered I/O, not to have to deal with this complexity. And it served us pretty well over time, I think.
I don't think that's really true. It worked well enough, true, but I think it allowed us to not fix deficiencies in a number of areas that we should just have fixed. IOW, I think we survived despite not offering DIO (because other things are good), rather than because of it.
Yes - from a purely technical point of view, DIO is superior in various ways. It allows tuning to specific I/O patterns, etc.
But it's also quite laborious to get right - not only does it require a fair amount of new code, but AFAIK there is significant variability between platforms and storage devices. I'm not sure the project had enough developer bandwidth back then. Or, put differently, it was more efficient to spend the developer time on other stuff with a better cost/benefit ratio.
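To make the "new code plus platform variability" point a bit more concrete, here is a rough sketch (not anything from PostgreSQL) of what even a single direct read involves. The helper name `read_first_block` and the 4096-byte `BLOCK` alignment are assumptions for illustration; real devices and filesystems may want a different alignment, `os.O_DIRECT` is missing on some platforms, and some filesystems reject the flag, so the sketch falls back to buffered I/O.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device/filesystem


def read_first_block(path):
    """Return (data, used_direct) for the first BLOCK bytes of path.

    Hypothetical helper: tries O_DIRECT where available and falls back
    to ordinary buffered I/O otherwise.
    """
    flag = getattr(os, "O_DIRECT", None)  # not defined on e.g. macOS or Windows
    if flag is not None:
        try:
            fd = os.open(path, os.O_RDONLY | flag)
        except OSError:
            fd = None  # some filesystems refuse O_DIRECT at open time
        if fd is not None:
            try:
                # Direct I/O needs an aligned buffer; an anonymous mmap
                # is page-aligned, which is sufficient under the assumption above.
                buf = mmap.mmap(-1, BLOCK)
                try:
                    n = os.readv(fd, [buf])
                    return buf[:n], True
                except OSError:
                    pass  # alignment or filesystem rejected the read; fall back
                finally:
                    buf.close()
            finally:
                os.close(fd)
    # Buffered fallback: the kernel page cache handles alignment for us.
    with open(path, "rb") as f:
        return f.read(BLOCK), False
```

Even this toy version needs a capability check, an aligned buffer, and two separate failure paths, and it still says nothing about writes, caching, or readahead, which a database doing DIO would have to handle itself.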