When you insert in sorted order into a b-tree (as you do with timestamps), later...

kscaldef · on May 13, 2010

Typically, one expects a b-tree lookup to require at most 1 disk seek (for the leaf-nodes), and ideally your indices are fully in memory. So, color me skeptical that fragmentation of the index itself is to blame here.

leif · on May 13, 2010

Inserting in sorted order is not a typical case, and I strongly doubt your "at most 1 disk seek" claim but my memory is fuzzy so you may be correct.

You're definitely right that the cache should be hot though, so I am now also a skeptical hue.

kscaldef · on May 13, 2010

One of the primary virtues of b-trees is that they make efficient use of disk. Even if the full data structure is too large to fit in memory, because of the large branching factor you can usually fit all the interior nodes in memory. So, in a lookup, only the final access of the leaf node requires a disk access. Of course, traversal may require additional seeks and reads.