> Pointers are still faster.

Depending on cache effects, indexes might or might not be faster than pointers. With indexes into an array, the nodes or edges sit sequentially in memory, which, depending on their size and access patterns, can increase the cache hit rate. Furthermore, while pointers are always 4 or 8 bytes, indexes can be as small as 2 or even 1 byte for smaller graphs, shrinking the structures and potentially raising the cache hit rate further.
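
To give a rough sense of the size difference, here's a minimal Rust sketch (hypothetical node layouts, not petgraph's actual types) comparing a node that stores two 4-byte indexes with one that stores two 8-byte pointers:

    // Hypothetical node layouts; petgraph's real types differ.

    /// Index-based node: neighbors are u32 indexes into a shared Vec of nodes.
    struct IdxNode {
        value: u32,
        left: u32,  // index of the left neighbor (u32::MAX could mean "none")
        right: u32, // index of the right neighbor
    }

    /// Pointer-based node: neighbors are separately allocated on the heap.
    struct PtrNode {
        value: u32,
        left: Option<Box<PtrNode>>,
        right: Option<Box<PtrNode>>,
    }

    fn main() {
        // Typically 12 bytes vs. 24 bytes on a 64-bit target, so more nodes fit
        // per cache line, on top of them being stored contiguously in a Vec.
        println!("index-based node:   {} bytes", std::mem::size_of::<IdxNode>());
        println!("pointer-based node: {} bytes", std::mem::size_of::<PtrNode>());
    }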

As for the cost of indexing, on x86 a single instruction can add the array base, a scaled index, and a constant offset, and load from or store to the resulting address. Other architectures might need a few more instructions, but that is dwarfed by the cost of a cache miss, which can be hundreds of cycles.
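
As a sketch (hypothetical types; the exact codegen depends on the compiler, target, and struct layout), an indexed field access like the one below typically folds the base, the scaled index, and the field offset into a single x86-64 addressing mode:

    // Hypothetical example; exact codegen depends on compiler and layout.
    struct Edge {
        weight: u32,
        next: u32, // 8-byte struct, `next` lives at offset 4
    }

    // On x86-64 the load below typically becomes (after the bounds check) a
    // single instruction along the lines of `mov eax, [base + 8*index + 4]`,
    // i.e. base + scaled index + constant offset folded into one address.
    fn next_of(edges: &[Edge], i: usize) -> u32 {
        edges[i].next
    }

    fn main() {
        let edges = vec![Edge { weight: 7, next: 1 }, Edge { weight: 9, next: 0 }];
        println!("{}", next_of(&edges, 0)); // prints 1
    }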

Another cost is the bounds check on every index into the array, which the compiler can't elide because it can't easily prove the index is within the array bounds. That is the main reason you saw "unsafe" code in the petgraph crate: there are places where the programmer knows the indexes are in bounds, because they came from a trusted place (the graph itself), but the compiler isn't smart enough to prove it, so the programmer manually bypasses the bounds checks in those cases.
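
The pattern looks roughly like this (a hypothetical helper, not petgraph's actual code); the unsafe block skips the bounds check because the index is trusted:

    /// Hypothetical helper: the caller guarantees that `i` was produced by
    /// the graph itself and is therefore in bounds.
    fn edge_target_unchecked(targets: &[u32], i: usize) -> u32 {
        debug_assert!(i < targets.len(), "index must come from the graph");
        // SAFETY: `i` is an index the graph handed out and the edge list has
        // not shrunk since, so it is in bounds and the check can be skipped.
        unsafe { *targets.get_unchecked(i) }
    }

    fn main() {
        let targets = vec![3_u32, 0, 2];
        println!("{}", edge_target_unchecked(&targets, 1)); // prints 0
    }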

All in all, I wouldn't know a priori which would be faster for a particular use case, pointers or array indices. I'd have to benchmark first.

> Also with arrays it’s expensive to reduce RAM usage after a lot of nodes were removed.

True, the "array indexes" approach is not as good for algorithms that need to remove many nodes (or edges, depending on how they're represented) from the graph. As long as you don't need the indexes to stay stable across deletions, there is a simple trick to make deletions cheaper: move the last element of the array into the newly freed slot, so all the empty slots stay at the end of the array. That trick can't be used if the indexes must remain stable (because they're referenced from outside the graph).
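
With a plain Vec of nodes, the trick is essentially Vec::swap_remove; here's a sketch with a hypothetical Node type (a real graph would also have to fix up any edges that still reference the moved node):

    // Hypothetical Node type, for illustration only.
    #[derive(Debug)]
    struct Node {
        value: u32,
    }

    /// O(1) removal that keeps the storage dense: Vec::swap_remove moves the
    /// last node into the freed slot, so that node's index changes from
    /// `nodes.len() - 1` to `i`. Only usable when indexes need not stay stable.
    fn remove_node(nodes: &mut Vec<Node>, i: usize) -> Node {
        nodes.swap_remove(i)
    }

    fn main() {
        let mut nodes = vec![Node { value: 10 }, Node { value: 20 }, Node { value: 30 }];
        remove_node(&mut nodes, 0);
        println!("{:?}", nodes); // [Node { value: 30 }, Node { value: 20 }]
    }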


Your point about reducing pointer size when the allocation pool is smaller makes sense. But I think it is not fair or realistic to imply that simply using an array the size of all RAM, starting at zero, leads to more fragmentation than referencing all of RAM relative to some other point, assuming the same algorithms work on the same-sized data sets. The relative locations of objects do not depend on where the coordinate origin is, although choosing a good zero may make the engineer's life easier.
