> There is even undefined behavior if you mess up namespacing! (Undiagnosed ODR ...

Rusky · on Oct 20, 2020

It carefully designs its name mangling such that this doesn't happen in the first place, even in the presence of multiple slightly-different builds of a library.

The only way to get matching names is to ask for them explicitly, via FFI.

jcelerier · on Oct 20, 2020

> It carefully designs its name mangling such that this doesn't happen in the first place, even in the presence of multiple slightly-different builds of a library.

But after checking, apparently this adds a hash of the function to the name mangling. How does that work when you want to call it from another language? For instance, you can call C++ code directly from some dialects of Lisp, Perl, Ada, or D (AFAIR), as they all have libs or mechanisms that kinda understand C++ name mangling. How are you going to do the same with, from what I'm seeing, "name_of_the_file::name_of_the_function::some_hash"?

> The only way to get matching names is to ask for them explicitly, via FFI.

And what happens if you have two libraries which expose the same extern-C function name?

Rusky · on Oct 20, 2020

If you want to call Rust code from another language, you use its FFI tools to export an un-mangled API. This is typical for C++ as well: trying to interop with mangled C++ names requires a lot of coordination across the toolchains, so even the examples you cite don't work without a lot of pain.

If you do wind up exporting the same name twice, you just get a linker error, because Rust doesn't play the same games C++ does with linkage. (This is also true of C++ FFI: the problematic ODR-violation stuff tends to involve more complex language features than `extern "C"`.)
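For illustration, here is a minimal sketch of exporting an un-mangled symbol from Rust, in the spirit of the comment above. The function name `bar_my_function` is made up for this example, and the `#[unsafe(no_mangle)]` spelling assumes Rust 1.82 or newer (older toolchains spell it `#[no_mangle]`).

```rust
// Hypothetical exported function: the name `bar_my_function` is an
// assumption for this sketch, not something from the thread.
// `no_mangle` asks the compiler to emit the symbol under exactly this
// name, and `extern "C"` selects the C calling convention.
// Assumes Rust 1.82+; on older toolchains write `#[no_mangle]` instead.
#[unsafe(no_mangle)]
pub extern "C" fn bar_my_function() -> i32 {
    0
}

// For comparison, an ordinary Rust function. Its symbol name is chosen
// by the compiler's mangling scheme, not by the programmer.
pub fn my_function() -> i32 {
    1
}

fn main() {
    // Both are callable from Rust as usual; only their symbol names differ.
    println!("exported: {}", bar_my_function());
    println!("mangled:  {}", my_function());
}
```

Because the exported name is taken literally, two definitions of `bar_my_function` in one program would compete for the same symbol, which is the case the comment says ends in a linker error.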

jcelerier · on Oct 20, 2020

> If you want to call Rust code from another language, you use its FFI tools to export an un-mangled API.

This does not answer the question of whether the behaviour is defined if multiple libraries export the same name (which is the original question). See my other comment: what happens if you dlopen libbar.so from Rust code?

Rusky · on Oct 20, 2020

The behavior in that case is defined by the implementation of dlopen. This is entirely outside of Rust's control, but fortunately it's also perfectly well-defined by the platform. Again, this does not intersect with the ODR violations I mentioned originally.

jcelerier · on Oct 20, 2020

> but fortunately it's also perfectly well-defined by the platform.

That's the same for every language and thus not very relevant. If you have single, static binaries/libraries, of course everything is simple, and you'll get linker errors in C++ just like you would in Rust. What is not simple is when you start loading twelve dozen libs at load time or run time, and it does not seem that Rust defines behaviour any more than C++ does in that case.

Rusky · on Oct 20, 2020

Right, I've been telling you it's not relevant for the past three comments now.

You seem to be under the impression that dlopen somehow interacts with language undefined behavior; it does not in either Rust or C++.

jcelerier · on Oct 21, 2020

> Right, I've been telling you it's not relevant for the past three comments now.

But it is! The only reason why ODR violations are UB in C++ is that the C++ language authors can't force the system linkers (again, whether at link time, load time or run time; I'm not only referring to dlopen) to perform LTO, which trivially makes ODR violations a diagnosable error.

But as far as I know, the Rust language authors can't do so either. So either the behaviour in Rust is as defined as in C++, or Rust does not support creating standard platform object files that are linked by ld, gold, or whatever (as D, Ada, Fortran etc. all support), which would make a fair number of use cases impossible; it's pretty common in some HPC circles to link C++ and Fortran directly in the same executable, for instance. And then, of course it's easier to define behaviour when you use a reduced set of constraints, but that definitely does not make it something worth bragging about.

Rusky · on Oct 21, 2020

Going back to my original answer, the difference lies in what the two languages ask of the linker under normal circumstances.

Normal, non-FFI-using C++ can hit ODR violations in response to things like typos or subtle misuses of `inline` and templates.

Normal, non-FFI-using Rust is designed such that these situations never come up.

My original comment was never talking about FFI in the first place, where yes, both languages are much more at the mercy of what the platform provides. However, in that case the spooky UB ODR violations I was referring to are also not relevant, because you just get normal, fully-defined platform behavior: the expectations of the compiler (and thus the chances for them to be violated, resulting in UB) are different.
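As a rough illustration of the "normal, non-FFI" case, here is a small sketch limited to a single crate. The module names `a` and `b` and the helper `type_name_of` are made up for this example. It only shows that two same-named functions are distinct items because their paths differ; it does not reproduce the multi-library scenario argued about in this thread.

```rust
// Two modules each define a function called `my_function`.
// The module names `a` and `b` are hypothetical.
mod a {
    pub fn my_function() -> i32 {
        0
    }
}

mod b {
    pub fn my_function() -> i32 {
        1
    }
}

// Illustrative helper: returns the compiler's best-effort name for the
// type of the value it is given (here, the function item types).
fn type_name_of<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    // Each call resolves through its full path, so the two definitions
    // never compete for one name.
    println!("{} returns {}", type_name_of(&a::my_function), a::my_function());
    println!("{} returns {}", type_name_of(&b::my_function), b::my_function());
}
```

Defining `my_function` twice inside the same module would be rejected at compile time, so within one crate a duplicate definition is a diagnosed error.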

jcelerier · on Oct 20, 2020

Carefully designed? LOL, what kind of joke is this:

    $ echo "pub fn my_function() -> i32 { 0 }" > bar.rs ; rustc --crate-type=dylib bar.rs && nm -A libbar.so| grep my_fun
    _ZN3bar11my_function17hce21faeb92ac13c6E

    $ echo "pub fn my_function() -> i32 { 1 }" > bar.rs ; rustc --crate-type=dylib bar.rs && nm -A libbar.so| grep my_fun
    _ZN3bar11my_function17hce21faeb92ac13c6E

jeremysalwen · on Oct 20, 2020

You missed the c++filt at the end:

    $ echo "pub fn my_function() -> i32 { 1 }" > bar.rs ; rustc --crate-type=dylib bar.rs && nm -A libbar.so| grep my_fun | c++filt
    libbar.so:0000000000047230 T bar::my_function::h4ed6ea856a52cd6b

So adding a single hash to the end of the symbol is a joke?
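To make the relationship between the raw and the filtered symbol concrete, here is a minimal sketch of the decoding step for symbols shaped like the ones in this thread: `_ZN`, then length-prefixed path segments, then `E`. The function name `demangle_legacy` is made up for this example, and it deliberately ignores escape sequences and everything else a real demangler handles.

```rust
/// Decode a symbol such as `_ZN3bar11my_function17hce21faeb92ac13c6E`
/// into a `::`-separated path. Returns `None` when the input does not
/// fit the simple `_ZN <len><segment>... E` shape handled by this sketch.
fn demangle_legacy(symbol: &str) -> Option<String> {
    // Strip the `_ZN` prefix and the trailing `E` that wrap the nested name.
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut segments = Vec::new();

    while !rest.is_empty() {
        // Each segment is a decimal length followed by that many bytes.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        // `get` yields `None` if the declared length overruns the input.
        let segment = rest.get(digits..end)?;
        segments.push(segment);
        rest = &rest[end..];
    }

    if segments.is_empty() {
        None
    } else {
        Some(segments.join("::"))
    }
}

fn main() {
    // Symbol copied from the `nm` output quoted earlier in the thread.
    let symbol = "_ZN3bar11my_function17hce21faeb92ac13c6E";
    match demangle_legacy(symbol) {
        Some(path) => println!("{symbol} -> {path}"),
        None => println!("{symbol} does not fit the simple shape"),
    }
}
```

The decoded path is only a display form. The name actually stored in the library's symbol table is the mangled one.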

jcelerier · on Oct 20, 2020

I don't think that dlsym would accept "bar::my_function::h4ed6ea856a52cd6b" as a symbol, just like it would not accept "foo(std::vector<int, std::allocator<int> >&)" and wants "_Z3fooRSt6vectorIiSaIiEE" instead, no?

To be clear, my comment was about the fact that two different functions produce exactly the same symbol name, c++filt or not, which is not what I was told above in "It carefully designs its name mangling such that this doesn't happen in the first place, even in the presence of multiple slightly-different builds of a library."

I have no particular objection to the idea of using a hash, though I believe that something that turns a 95% chance of getting an error into a 0.5% chance (I'd assume, as it took me 10 seconds to find a collision) is very bad: you want errors consistently when you fuck up, not once every hash collision, as that sounds like a really, really big pain to debug when it happens.

jeremysalwen · on Oct 20, 2020

I see, I misunderstood your comment as being about the mangling convention, not about the collision.