The issue is that the bar keeps moving on where it makes sense to switch from one approach to the other.
I recently implemented the Adaptive Radix Tree from https://db.in.tum.de/~leis/papers/ART.pdf in Rust. It defines explicit node types with different trade-offs between search and memory complexity, chosen adaptively to improve overall performance depending on fan-out, prefix sparsity, etc.: nodes with 4, 16, 48, and 256 children. The 16- and 48-child nodes keep their keys sorted (the 256 is implicitly sorted because it's just a direct byte->child array), so binary search can be used (with SIMD optimizations).
But in reality, my benchmarking showed linear scan outperforming this fairly consistently, and the cost of keeping the keys sorted is not justified (except that it makes ordered iteration easier, so that's the trade-off) -- though evidently it wasn't like that at the time the paper was written (2013).
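To make the comparison concrete, here's a minimal sketch of the two lookup strategies on a Node16-style inner node. The struct and method names are hypothetical (not from the paper or my actual implementation), and children are omitted; the point is just that both strategies must return the same slot index over a sorted key array.

```rust
/// A minimal Node16-style inner node: up to 16 sorted one-byte keys.
/// (Children omitted; a real node stores one child pointer per key.)
struct Node16 {
    keys: [u8; 16],
    len: usize,
}

impl Node16 {
    /// Binary search over the sorted key prefix, as the paper suggests.
    fn find_binary(&self, key: u8) -> Option<usize> {
        self.keys[..self.len].binary_search(&key).ok()
    }

    /// Plain linear scan. For <= 16 byte keys this is branch-friendly
    /// and auto-vectorizes well, which is why it can win in practice.
    fn find_linear(&self, key: u8) -> Option<usize> {
        self.keys[..self.len].iter().position(|&k| k == key)
    }
}

fn main() {
    let mut node = Node16 { keys: [0; 16], len: 0 };
    for (i, k) in [3u8, 7, 42, 100, 200].iter().enumerate() {
        node.keys[i] = *k; // keys inserted in sorted order
        node.len += 1;
    }
    // Both strategies must agree on every possible byte, hit or miss.
    for k in 0u16..=255 {
        assert_eq!(node.find_binary(k as u8), node.find_linear(k as u8));
    }
    println!("ok");
}
```

Since the keys have no duplicates, `binary_search` and `position` always agree; the only behavioral difference is the memory-access and branch pattern, which is exactly what the benchmark ends up measuring.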
I'm sure there are plenty of examples like this. We're likely using data structures all over the place that were built under the assumption that linear scan was expensive, with optimizations added to hash or sort the data -- and those optimizations may now be hindering rather than helping.