Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think I understand. The idea would be to have one HashMap<tKey, tValue> that holds the objects themselves, and then a secondary HashMap<tKey2, *tValue> (with a star this time) that indexes on some other key and points to values stored directly in the first map?

What's the benefit of doing that, compared to making both the HashMaps store pointers to independently allocated objects on the heap, such that insertions into one map never invalidate the other map? Is the hope to avoid paying the cost of an extra pointer dereference when we're using the first map? Or does independently allocating each object hurt cache locality or something like that?



> Is the hope to avoid paying the cost of an extra pointer dereference when we're using the first map? Or does independently allocating each object hurt cache locality or something like that?

Both.

In practice, I probably wouldn’t use a hashmap for the first container that actually owns these items. When I do expect gigabytes of data, in C++ I use something like vector<vector<tValue>>, where the inner vectors are of the same fixed size (except for the last one), e.g. 2-16MB RAM / each. If I need to erase elements, I include a free list such as this one: https://github.com/Const-me/CollectionMicrobench/blob/master...

But the exact container is not that important here. If you don’t have that many values, it can as well be a standard vector.

The point is, C++ allows composing these containers making higher-level ones, such as this indexed array example, using pointers to link individual items across them. This feature allows building sophisticated and efficient data structures that are still possible to reason about.


Right, and I don't understand why you think that same exact approach wouldn't work in Rust either. If you have a `Vec<Vec<tValue>>`, then you can spread all the raw pointers you want everywhere without any additional boxing of `tValue`, and you know exactly when those pointers might become invalidated: whenever you call an `&mut` method on your `Vec<Vec<tValue>>` (or rather, on one of the interior `Vec<tValue>`s). Because of that, you can even build safe abstractions on top of such data structures such that your callers can't possibly misuse it (without themselves using `unsafe`).

The technique of giving stable addresses to things by stuffing them into vectors isn't unique to C++. People do it in Rust too: https://github.com/SimonSapin/rust-typed-arena/blob/master/s...


The exact container is not that important here. The point is, C++ allows composing these containers making higher-level ones, such as this indexed array example.

They can be standard, third-party, my own, I still can compose them.

About my particular example, I’m not sure you can easily implement a free list in rust, to reuse space from de-allocated items. Especially if these items have non-empty constructor and destructor.


> The point is, C++ allows composing these containers making higher-level ones, such as this indexed array example.

What I---and others---are trying to tell you is that it's perfectly possible in Rust too. I don't think you've pointed out anything that isn't possible in Rust. My previous comment was exactly about composing containers to make higher-level ones.

Have you tried building such things? Did you get stuck? Maybe someone can help.

> About my particular example, I’m not sure you can easily implement a free list in rust, to reuse space from de-allocated items. Especially if these items have non-empty constructor and destructor.

I don't see any reason why implementing a free list in Rust wouldn't be possible either.


It seems like one of Const-me's objections is that Rust data structures like HashMap don't document a lot of guarantees about when they would and wouldn't invalidate unsafe interior pointers. That said, for Vec in particular, Rust actually makes a ton of guarantees about its layout (more that C++ std::vector I think): https://doc.rust-lang.org/std/vec/struct.Vec.html


Correct.

While vectors are comparable, C++ also guarantees a lot about the rest of the containers. E.g. unordered associative containers never expire pointers to keys or values. Linked lists never expire pointers nor iterators.

In C++ I can create an efficient LRU cache in a dozen lines of code, combining list<const tKey* > with unordered_map<tKey, struct{tValue, list<const tKey* >::iterator}> (this implies tKey is not an int, otherwise list<tKey> is more efficient). Rust’s built-in LinkedHashMap had to reimplement a linked list instead.


Thanks for the example -- it was hard for me to picture what was going on before. I wonder if the unstable "Placer" APIs will make the standard LinkedList more capable of this sort of thing, but I'm not really familiar with them. Still, I suspect it'll always require `unsafe`.


> I don't see any reason why implementing a free list in Rust wouldn't be possible either.

Is placement new available in rust stable?


Why do you think placement new is necessary to implement a free list?


Object size possibly.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: