
It sounds like a more inherent result of ZFS being copy-on-write: if you run a pre-allocated, fixed-size database on top of it that applies random writes to existing records, it will heavily fragment over time, where a write-in-place filesystem would not.
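A toy simulation makes the mechanism concrete (this is not ZFS's actual allocator, just a minimal copy-on-write model with a bump allocator): logical blocks start contiguous, every overwrite is redirected to a fresh physical location, and after enough random updates almost no logically adjacent blocks remain physically adjacent.

    import random

    # Minimal copy-on-write model (not real ZFS logic): a file's
    # logical blocks start physically contiguous, but every
    # overwrite is redirected to a new physical location.
    N_BLOCKS = 10_000          # logical blocks in the "database file"
    N_WRITES = 50_000          # random record overwrites

    phys = list(range(N_BLOCKS))   # logical block i -> physical block
    next_free = N_BLOCKS           # bump allocator for new blocks

    for _ in range(N_WRITES):
        i = random.randrange(N_BLOCKS)  # random update of a record
        phys[i] = next_free             # CoW: goes to a new location
        next_free += 1

    # Fraction of logically adjacent pairs still physically adjacent;
    # a sequential scan pays a seek for every broken pair.
    contig = sum(phys[i + 1] == phys[i] + 1 for i in range(N_BLOCKS - 1))
    print(f"contiguous pairs remaining: {contig / (N_BLOCKS - 1):.1%}")

With these numbers nearly every block has been rewritten at least once, so the contiguous fraction drops to roughly 1%.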

I think it's fair to examine this behavior without mediating features such as the ARC, since ideally we want the ARC boosting performance well above the block storage medium, not just earning back performance lost to copy-on-write.

I think the key takeaway is this: if you have spinning rust and expect to sequentially read a large database table from disk by primary key, then to the extent old records have been updated, they will be fragmented, causing unexpected seeks that kill your read throughput; and there's no way to defragment without taking the database offline. Some rough arithmetic below shows how large the penalty can be.
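Back-of-envelope numbers (all figures are assumptions for a typical 7200 RPM disk, not measurements): if every page of a fully fragmented table costs one seek, the "sequential" scan degrades to random-read throughput.

    # Assumed figures for a 7200 RPM disk; adjust for your hardware.
    seq_mb_s = 150          # sequential throughput, MB/s
    seek_ms = 8             # avg seek + rotational latency, ms
    page_kb = 16            # database page size, KB

    # Worst case: one seek per page read.
    random_mb_s = (page_kb / 1024) / (seek_ms / 1000)
    print(f"sequential: {seq_mb_s} MB/s")
    print(f"fragmented: {random_mb_s:.1f} MB/s "
          f"(~{seq_mb_s / random_mb_s:.0f}x slower)")

Under these assumptions a fully fragmented scan runs at about 2 MB/s, roughly 75x slower than the contiguous case.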

It sounds like a serious factor to consider depending on the particular workload your database expects.



