The best solution is to use a 0-copy format like flatbuffers, google's zerocopy ...

Ericson2314 · on Feb 22, 2020

In the spirit of my other comment https://news.ycombinator.com/item?id=22375979 it would be nice to say the layers broken out into separately usable abstractions. Ideally, this means layering, K and V type params, and no loss of performance.

> Prefix encoding and suffix truncation

At the very least, you can do all this stuff in the key is like [u8]/Vec<u8> case. But maybe also [T]/Vec<T>? I love to look at my existing monomorphic algorithms, and think "what traits would this require to make this as abstract as possible"? Almost never is the answer "sorry, it cannot be generalized at all".

> All K and V types must then be serializable and deserializable....and if I do anything like this, it will be the approach I take.....

I agree with your own traits. Per the above your algorithms come first, not other people's abstractions. But I don't know why you can't just do that and ignore serde entirely. Serde becomes someone else's problem, and if they don't want to bother with it you will provide the [u8]/Vec<u8> instances.

> And in the mean time I'm more tempted to just point people to the 3 solutions above that let them view their bytes as structured data without many deserialization costs.

Agreed. If you data structure has a pointer, that should be a foreign key in my book (though not necessarily in the same direction!). It's best if you can just memcopy the row, more or less.

[I say the pointer thing not only because marshaling cost, but also data modeling. Huge rows / json blogs just don't make sense to me. There almost surely is some structure to the nested data, and that deserves to be formalized and enforced in its own index. Ignore what they told you in SQL class. Indexes express/reify invariants, and foreign keys act through indices.]