Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Tangentially, I did a bit of Rust work recently. I was sadly unable to find a concise credible answer to a rather elementary best-practices question: How does ownership interact with nested datastructures? Is it possible to build a heap tree without Boxing every node explicitly?


This question is a bit subtle, it depends on exactly what you mean. You could make a tree using only borrow checked references and the compiler would make sure that parent nodes go out of scope at the same time or before the child nodes they point to, but I don't think that's what you're talking about.

In general, if it's a datastructure where you have to use pointers, you'll have them Box'ed, but you would try to avoid that if you can. In your example of a heap, you'd want to use an array-based implementation, probably backed by a growable Vec, and use indexes internally. A peek function would still return a normal Rust reference to the data, and the borrow checker would make sure that you don't mutate the heap's backing array while that reference was still in use, etc.


I never thought about using a Vec for these, but that is a great idea for keeping the memory management sane for tree/linked lists.

One thing I would add that you need to be wary of destructors with large pointer data structures in Rust since it can easily stack overflow. When using Option<Box<T>> you need to be careful to call Option::take on the pointers in a loop to avoid stack overflow.


You'd do the same stuff you'd do in C++ here; allocate every node explicitly, use an arena, whatever you want.



Thanks. Saw that before, but the credibility/length ratio wasn't high enough to read it more carefully. It appears that we do have to Box/Rc/Arc nodes in a recursive datastructure. Doable, but a bit on the inconvenient side.

    struct Node {
        elem: i32,
        next: Option<Box<Node>>,
    }


All explanations start with why without any indirection, Node would be a recursive, infinitely large type. Therefore the Node must be a pointer. Ok. But then, Rust then forces you to answer this question: who will own the data referred to by the pointer? Consequently, who will be responsible for freeing it?

If you use a &mut Node as your pointer, you are attempting to answer those questions with "not me". Someone else has to own the Nodes. They've got to be somewhere on the stack or on the heap. There's nothing stopping you from defining the next pointer as Option<&'a mut Node>. The problem is actually constructing a list.

Not many answers tell you why you can't do this in practice. I agree that this is not explained well enough in general, because new Rustaceans don't intuitively reach for references so it probably doesn't come up much and they're hard to use for this. But it's not that hard to see why:

Imagine you try to allocate all the Nodes at once (e.g. an array), and then use &mut references into the pre-allocated array. In order to set one &mut Node's next pointer, you will have to hold another &mut Node to set it to. This means you need to acquire mutable references to array elements, in the order that you wish them to appear in the linked list. This is actually really tricky to do: slice::get_mut(index)'s returned reference borrows the entire slice, so it doesn't let you have a &mut reference to two nodes at the same time. You need smaller &mut [Node] slices, somehow.

slice::split_first_mut is one way (in order), but if you have an array and can only create a linked list in the order nodes appear in the array, what's the point? Just use the array! Any other compiler-checked access order scheme will also be so limiting that you should just use a data structure of that exact shape anyway. To use an arbitrary order, you're going to need unsafe, so you'd basically be writing C.

To be fair, there is basically one application of this, and it's to have a sparsely populated constant size array that needs to be iterated in order. I made a demo:

https://play.rust-lang.org/?version=stable&mode=debug&editio...

The other problem is that you can't resize the backing array: your &mut Node references would be invalidated.

For this reason, pre-allocated lists like these are usually done with indices instead of references. The overhead is one pointer + offset and then a bounds check when dereferencing.

---

The other solutions answer the ownership question like so:

- You can use Box, so that each Node (acting as a list head) owns the entire tail of the list, uniquely, such that no other list can also refer to it. The tail is freed when the head is, unless of course you detach it first (let tail = node.next.take();).

- You can Arc/Rc the nodes, so that each node has a pointer, but not a unique pointer, to the next node. These can be duplicated, so lists can exhibit structural sharing if you are comfortable with that. Because of the sharing, freeing the head does not necessarily free any/all of the tail.


An improvement, using the Node struct and apparently pushing the limits of borrowck: https://play.rust-lang.org/?version=stable&mode=debug&editio...

If you actually want this intrusive linked list functionality, consider using a real-world implementation like this: https://lib.rs/intrusive_collections


Thanks for the in-depth dive. My usecase is doing transformations over an AST: trees are immutable, but may become shared or dead deep in the middle of some complex transformation. Probably Rc<Node> is the reasonable approach, as Box is too constraining.


Of course. Arena allocation comes to mind.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: