(OK, well, it may be using hashes for the per-SST Bloom filter, but that's not what's interesting here. It may also be using hashes in the memtable, in which case they're using it very inefficiently, since the key encoding is not prefix-scan-friendly.)
I don't know anything about RocksDB, but on the surface this approach seems like it could be very slow. Wouldn't it be more efficient to encode the semver in a format better suited to sorting?
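To make the "sort-friendly encoding" idea concrete, here is a minimal sketch. It assumes plain `major.minor.patch` versions with non-negative components that fit in an `int`, and it ignores pre-release and build tags. `SemverKey` is a hypothetical helper, not something from RocksDB or the original post. Each component is written as a fixed-width big-endian integer, so a plain unsigned byte-by-byte comparison gives the same order as comparing the versions numerically.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of RocksDB or any semver library.
final class SemverKey {
    // Encodes "major.minor.patch" as 12 bytes: three big-endian 32-bit integers.
    static byte[] encode(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + version);
        }
        ByteBuffer buf = ByteBuffer.allocate(12); // ByteBuffer is big-endian by default
        for (String part : parts) {
            int n = Integer.parseInt(part);
            if (n < 0) {
                throw new IllegalArgumentException("negative component: " + version);
            }
            buf.putInt(n);
        }
        return buf.array();
    }

    // Unsigned lexicographic comparison of the encoded keys,
    // i.e. the kind of ordering a bytewise key comparator applies.
    static int compare(String a, String b) {
        return Arrays.compareUnsigned(encode(a), encode(b));
    }
}
```

With this encoding `1.9.0` sorts before `1.10.0`, which is not the case if you compare the raw strings.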
By default RocksDB uses a BytewiseComparator to sort the keys in the SST. However, RocksDB allows you to provide any comparator you wish, so ultimately it will depend on the performance of the comparator that you implement.
Using RocksDB here seems fairly insane unless you need to keep (at least) several billion constantly-updating versions sorted. Otherwise you can just use SortedMap and regular Java comparators.
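For what it's worth, the in-memory version is only a few lines. This is a rough sketch that assumes plain `major.minor.patch` strings with no pre-release tags; `SemverSortedMap` and `BY_SEMVER` are made-up names for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper: a TreeMap keyed by version strings, ordered numerically.
final class SemverSortedMap {
    // Splits "major.minor.patch" into its three numeric components.
    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + version);
        }
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Compares component by component: major first, then minor, then patch.
    static final Comparator<String> BY_SEMVER =
            (a, b) -> Arrays.compare(parse(a), parse(b));

    static <V> SortedMap<String, V> create() {
        return new TreeMap<>(BY_SEMVER);
    }
}
```

Re-parsing on every comparison is wasteful; a real version would probably parse once into a small key class. Note that this keeps everything in memory only.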
They might just want cheap disk persistence. I'm not a Java developer, but if SortedMaps are a memory-only data structure, persisting them reliably doesn't look like an easy feat.
How would you persist a SortedMap in Java easily? Serializing and deserializing is OK, but keep in mind that if the program crashes you still need everything written before the crash to survive (so you can't just keep it in memory and serialize at exit, or at a time interval).
Supporting insertion order is straightforward (but has some trade-offs) - you store the values in a backing array or linked list, and the hash table array stores pointers to the values.
You could do something similar with a sorted data structure backing the hashmap to get sorted order (a b-tree or something similar).
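A minimal sketch of the insertion-order layout described above, assuming non-null keys and no removals (deletion is one of the trade-offs, since it leaves holes in the backing list). `InsertionOrderedMap` is a made-up class for illustration, not how `LinkedHashMap` or any particular library does it. Entries live in append-only lists, and the hash table itself only stores indices into them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: open-addressing hash table whose slots point into
// append-only entry lists, so iteration order is insertion order.
final class InsertionOrderedMap<K, V> {
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();
    // Each slot holds (entry index + 1); 0 means the slot is empty.
    private int[] slots = new int[16];

    void put(K key, V value) {
        int slot = findSlot(key);
        if (slots[slot] != 0) {
            // Existing key: overwrite the value, keep its original position.
            values.set(slots[slot] - 1, value);
            return;
        }
        keys.add(key);
        values.add(value);
        slots[slot] = keys.size();
        if (keys.size() * 2 > slots.length) {
            grow(); // keep the load factor at or below 0.5
        }
    }

    V get(K key) {
        int entry = slots[findSlot(key)];
        return entry == 0 ? null : values.get(entry - 1);
    }

    List<K> keysInInsertionOrder() {
        return new ArrayList<>(keys);
    }

    // Linear probing: returns the slot holding the key, or the first empty slot.
    private int findSlot(K key) {
        int mask = slots.length - 1; // table length is always a power of two
        int i = key.hashCode() & mask;
        while (slots[i] != 0 && !keys.get(slots[i] - 1).equals(key)) {
            i = (i + 1) & mask;
        }
        return i;
    }

    // Doubles the table and re-inserts indices; the entry lists are untouched.
    private void grow() {
        slots = new int[slots.length * 2];
        for (int e = 0; e < keys.size(); e++) {
            slots[findSlot(keys.get(e))] = e + 1;
        }
    }
}
```

Swapping the append-only lists for a sorted structure gives the sorted variant, at the cost of lookups into that structure no longer being a simple index.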
You could have the hash function preserve order at the cost of it being a very bad hash function. If you had n buckets, then the first 1/n of the keyspace would map to bucket 0, the next 1/n would map to bucket 1, etc.
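Roughly, assuming keys are integers in a known range `[0, keySpace)` (the class and parameter names here are made up for illustration):

```java
// Hypothetical sketch of an order-preserving "hash": it just scales the key
// into the bucket range, so larger keys never land in an earlier bucket.
final class OrderPreservingBuckets {
    static int bucket(long key, long keySpace, int buckets) {
        if (keySpace <= 0 || buckets <= 0 || key < 0 || key >= keySpace) {
            throw new IllegalArgumentException("key must be in [0, keySpace)");
        }
        // multiplyExact throws instead of silently overflowing for huge keys.
        return (int) (Math.multiplyExact(key, (long) buckets) / keySpace);
    }
}
```

The reason it's a bad hash function is that it does nothing to spread out clustered keys: with a keyspace of 100 and 4 buckets, keys 0 through 24 all end up in bucket 0.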
...somewhere, this person's algorithms teacher is pondering their life choices.