This is an incredibly exciting development. One thing I would like to point out is that the existence and progress of this PR shows a change in approach in Rust over the last year or so. A lot of long-standing issues were basically blocked because their surface area made them too big to solve. Now the Rust project is more willing to slice problems down into smaller pieces and give those a try.
This ABI will be limited, but despite those limits it has a lot of utility.
On Linux, you don't need C at all since you can interface with the kernel directly. The ABI is stable and simple enough that a new programming language could have a system_call keyword that makes the compiler emit the system call code.
With this one piece of functionality, it's possible to do literally anything on Linux. No need for C libraries and their legacy at all. The hardest part will be describing the Linux user space API data structures in the new language so that they can be passed to and from the kernel.
There are no such considerations when issuing Linux system calls. You place parameters in specific registers, trap into kernel mode and then read back the result from a specific register.
There is one MIPS architecture where you have to pass some arguments on the stack but that's about it.
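To make the "registers, trap, read back" description concrete, here is a minimal sketch in Rust that issues getpid directly on x86-64 Linux without going through libc. The helper name `raw_getpid` is made up for this example, syscall number 39 is the x86-64 number for getpid, and the fallback branch only exists so the snippet still builds on other targets.

```rust
// Sketch: call getpid(2) with the raw `syscall` instruction (x86-64 Linux only).
// `raw_getpid` is a hypothetical helper name, not part of any library.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> i64 {
    use std::arch::asm;
    const SYS_GETPID: i64 = 39; // x86-64 syscall number for getpid
    let ret: i64;
    unsafe {
        asm!(
            "syscall",
            // rax carries the syscall number in and the result out.
            inlateout("rax") SYS_GETPID => ret,
            // The `syscall` instruction itself clobbers rcx and r11.
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    ret
}

// Fallback so the sketch still compiles on other targets (not a raw syscall).
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn raw_getpid() -> i64 {
    std::process::id() as i64
}

fn main() {
    println!("pid via raw syscall: {}", raw_getpid());
    println!("pid via std:         {}", std::process::id());
}
```

A syscall that takes arguments would additionally load them into rdi, rsi, rdx, r10, r8 and r9, in that order.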
> vdso are C-ABI objects
Yes, unfortunately. It is not necessary to use the vDSO though. It is a performance optimization. The normal system calls will work just fine.
Well, alright, but then you still need to parse C to be able to divine the shape of the data structures, as they're defined in C (or you have to make some assumptions).
Yes. Fortunately, these definitions are significantly less complex than libc stuff. They all use typedefs prefixed with __kernel that are defined in asm and asm-generic headers.
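As a rough illustration of how small these definitions tend to be, here is a hand-written Rust mirror of the kernel's `struct __kernel_timespec`, which as far as I recall is two 64-bit signed fields. The Rust type name `KernelTimespec` is an assumption of this sketch, and a real binding should be checked against the actual headers.

```rust
use std::mem::size_of;

// Hand-written mirror of `struct __kernel_timespec` from the kernel's
// user space API headers (layout recalled from memory, verify before use):
//   __kernel_time64_t tv_sec;   // 64-bit signed seconds
//   long long         tv_nsec;  // nanoseconds
// `KernelTimespec` is a hypothetical name chosen for this sketch.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct KernelTimespec {
    tv_sec: i64,
    tv_nsec: i64,
}

fn main() {
    let ts = KernelTimespec { tv_sec: 1, tv_nsec: 500_000_000 };
    // #[repr(C)] keeps the fields in declaration order with C padding rules,
    // so the struct can be handed to the kernel by pointer.
    println!("{:?} occupies {} bytes", ts, size_of::<KernelTimespec>());
}
```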
The Linux kernel headers with all relevant structure definitions are here:
TBF, the whole point of any ABI is that you don't need a specific programming language to call into functions implementing that ABI, could even use assembly.
This is huge news!! I've been waiting for something like this from the Rust team for years! I hope libraries can be built for other languages that would allow cleaner interfacing directly with Rust that doesn't rely on the boat anchor of the legacy C ABI that all languages are tied to!
This is the most important "feature" of Rust for its long-term success.
Edit: It looks like this is still using the C ABI as its base, which increases the burden for other languages to implement it. What a letdown...
> It looks like this is still using the C ABI as its base, which increases the burden for other languages to implement it. What a letdown...
How so? It seems to me that actually decreases the burden. For example C# people may not implement the Rust ABI if they don't want to, but since they (and basically every language) already support C ABI, I can implement a library over the existing C ABI so that I can access Rust types.
Correct, the entire goal of defining it in terms of the C ABI is that C is the lingua franca that every language already speaks, so anything built on top of that has a dramatically lowered barrier to entry.
I assume that the grandparent poster was thinking about making it possible for hypothetical future languages to someday be able to forego supporting the C ABI entirely, which is certainly a laudable goal (there's plenty of historical baggage there), but what the Rust developers care more about here is simply being able to do FFI that's safer than what currently exists.
Because if you're implementing it properly with the advanced features of a language such as Rust, you need to similarly create an entirely different ABI that goes down to the C ABI, increasing the complexity of the implementation because the C ABI has so many edge cases.
Basing it on the C ABI makes it only easier for "simple" languages like C to communicate with Rust.
> increasing the complexity of the implementation because the C ABI has so many edge cases.
Every language speaks C, and whether we like it or not every new language will continue to speak C for the foreseeable (for-C-able?) future. This cost is already baked in, so you might as well leverage it to reduce the cost of a second ABI by building it on top of the ABI you're already guaranteed to support.
What edge cases are there in the C ABI, and what can't you represent with it?
The big problem to me is sharing pointers across FFI which is extremely dangerous in managed languages where the pointer could be invalidated by a GC sweep, but I don't think that's terribly relevant to Rust.
They inherit a lot of issues from C like badly defined types; they rely on some unfit conventions like \0-terminated strings; they do have some issues defining the memory management model (but I think that's unavoidable).
I have never dwelled on low-level compatibility for long, so I certainly don't know about most issues. But even with a very small view, cleaning up the legacy looks like a very worthwhile thing to do. Maybe after Rust has gained more adoption, we can start to make C adhere to it, instead of the other way around.
The C ABI is (broadly speaking) a standard way to lay out structs and a choice of calling convention for the platform. That doesn't influence which types are/are-not defined (and the superset of a C compatible ABI can fix some subtle issues there, like untagged unions).
Null terminated strings are not a part of the ABI, they're a convention in C libraries. Same goes for memory management, that's not really a part of the ABI (but it is an issue, since you need to define who is responsible for memory if it is shared across FFI, but again, the big issue there is garbage collection).
Any difficulty in understanding your types and discovering their sizes will make it harder to use the ABI. Any convention that you decide not to follow will break the interoperability of other software you may call over the ABI. And the C ABI doesn't carry information about memory management because people decided to build it that way; it could very well do so.
But if you want to define an "ABI" as strictly the geometry the data has in memory when a function is called... well, good luck using that information on the other end to retrieve the data in your function; it's completely useless. But yes, I don't see any problem with it either.
The C ABI carries memory ownership information just fine - you simply pass a struct with two pointers, one for the memory, one for the function that does the cleanup.
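The two-pointer struct mentioned above could look something like the following sketch. All names here (`Owned`, `free_boxed_u64`, `make_owned`, `consume`) are invented for illustration, and the `FREED` counter exists only so the demo can show that the cleanup callback ran.

```rust
use std::os::raw::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

// Demo-only counter of how many times the cleanup callback has run.
static FREED: AtomicUsize = AtomicUsize::new(0);

// Two pointers: the memory itself, and the function that knows how to free it.
#[repr(C)]
struct Owned {
    data: *mut c_void,
    free: unsafe extern "C" fn(*mut c_void),
}

// Cleanup callback for a value that was allocated as Box<u64>.
unsafe extern "C" fn free_boxed_u64(data: *mut c_void) {
    unsafe { drop(Box::from_raw(data as *mut u64)) };
    FREED.fetch_add(1, Ordering::SeqCst);
}

// Producer side: hand out the allocation together with its destructor.
fn make_owned(value: u64) -> Owned {
    Owned {
        data: Box::into_raw(Box::new(value)) as *mut c_void,
        free: free_boxed_u64,
    }
}

// Consumer side: it never needs to know which allocator produced `data`.
fn consume(owned: Owned) -> u64 {
    let value = unsafe { *(owned.data as *const u64) };
    unsafe { (owned.free)(owned.data) };
    value
}

fn main() {
    let value = consume(make_owned(42));
    println!("value = {}, cleanups = {}", value, FREED.load(Ordering::SeqCst));
}
```

The point of the design is that whoever receives the struct can release the memory correctly without sharing an allocator with the producer. What the bare struct does not encode is when it is safe to call `free`, which is still a convention both sides have to agree on.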
Well, you need some standard convention on how function parameters and return values map to registers and memory locations. Whether those conventions are called "C ABI" or not doesn't really matter; those ABI conventions are much closer to the CPU architecture than any specific programming language anyway.
IMHO yes, if you squint a bit, there's no such thing as a "C ABI", C compilers just happen to implement those CPU/OS specific ABIs without much 'translation work' going on.
This is interesting but seems very basic. It looks like it is just lowering types that were previously too complex into C ABI.
Notably it doesn't solve anything about versioning. I honestly quite like the Swift approach. It isn't zero overhead but it defaults to really good compatibility. Basically it adds a vtable for everything including member offsets and object sizes.
But the really nice thing is that you can tell the compiler when you don't need compatibility (code and types in the same library, or a library that you statically link) and the overhead just disappears.
Of course you lose a few language features (you can't know the size of a type) but then they have tags that you can apply to trade compatibility/flexibility to get those features back.
That just pushes the problem onto the user. There are lots of well-documented problems with C versioning. Just saying "don't change the size, order, alignment or layout of anything ever" isn't a great option. It can be done (Microsoft has done a pretty good job) but it requires infinite foresight and is very expensive.
Swift allows libraries to naturally evolve while maintaining compatibility. Most things that you would expect to work (add a new field) just work. Good luck adding a new field to `jmp_buf`. It's effectively impossible to do in an ABI compatible way. The fact that C's ABI is stable/backwards compatible doesn't help.
Basically a backwards compatible ABI is the absolute minimum requirement for making backwards compatible libraries. Having more options removes heaps of complexity for library authors.
For those unaware, the abi_stable crate makes stable ABIs (even with complex features like trait objects) pretty easy and, importantly, verifiable. It is primarily useful for rust-to-rust abi stability (for instance when creating a plugin system).
I think it would be amazing if there was also a way to export a machine readable description of the layout of an interoperable object. Then other languages could parse it instead of needing to parse Rust code and know all the ABI rules.
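As a sketch of what such a description could contain, the snippet below prints the size, alignment and field offsets of a `#[repr(C)]` type as a small JSON document. The type `ByteSlice`, the function `describe_layout` and the JSON shape are all invented for this example, and `offset_of!` needs a reasonably recent Rust toolchain (1.77 or later, as far as I know).

```rust
use std::mem::{align_of, offset_of, size_of};

// A C-compatible stand-in for a slice crossing an FFI boundary.
// `ByteSlice` is a hypothetical type for this sketch.
#[repr(C)]
#[allow(dead_code)]
struct ByteSlice {
    ptr: *const u8,
    len: usize,
}

// Emit a small JSON description that a binding generator in another
// language could parse instead of parsing Rust source.
fn describe_layout() -> String {
    format!(
        "{{\"name\":\"ByteSlice\",\"size\":{},\"align\":{},\"fields\":[{{\"name\":\"ptr\",\"offset\":{}}},{{\"name\":\"len\",\"offset\":{}}}]}}",
        size_of::<ByteSlice>(),
        align_of::<ByteSlice>(),
        offset_of!(ByteSlice, ptr),
        offset_of!(ByteSlice, len),
    )
}

fn main() {
    println!("{}", describe_layout());
}
```

A real tool would presumably generate this from the type definitions rather than by hand, but the output is the kind of thing another language's binding generator could consume.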
This is typically called an IDL, and TFA mentions inspiration from several examples under "prior art." At the top is the Web Assembly Interface proposal which has the wit [0] format,
Yes, to some degree. You can dynamically link Rust today using the C ABI, but then you lose all the richness and safety of Rust types. The interoperable ABI will allow more of Rust's type system to work across the boundary.
If it's a simple enough binary interface, sure. Simplicity is really important.
C++ has standardized ABIs but you just don't see people dynamically loading a C++ shared object, obtaining a pointer to a C++ object and calling its member functions. They always do it through the C ABI if they do it at all. Some would rather rewrite stuff in C than put up with that. C++ ABIs are so obnoxiously complex that the only things that dare to touch them are C++ compilers, and even they have been known to break things. I just get this incredibly hopeless feeling whenever I try to contemplate how I'd handle a C++ exception from Python code. The only reasonable answer I can think of is I wouldn't, I'd just disable C++ exceptions instead and hope for the best.
People use C because it's just simple symbols and unchanging calling conventions. Look up symbols to get function pointers, put parameters in specific registers, call the function and off you go. Nobody will interface with Rust if the ABI is too complex. Given the language's many high level abstractions, I think it's likely we'll see a repeat of the C++ situation above.
> Given the language's many high level abstractions, I think it's likely we'll see a repeat of the C++ situation above.
The goals in the OP seem to suggest that the objective is not to have a bespoke ABI that encapsulates the entire Rust language, but rather to build a relatively minor shim on top of the existing C ABI that supports some useful Rust patterns (e.g. Option, Result) which should be possible to express straightforwardly in many languages. If this effort succeeds then I can absolutely see it supplanting the C ABI for at least the dynamically-loaded-Rust-from-Rust use case.
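To give a feel for how thin such a shim can be, here is a sketch of a Result-like type expressed in terms Rust already guarantees a C-compatible layout for: a `#[repr(C, u8)]` enum, which is laid out as a one-byte tag followed by a union of the payloads. `FfiResult` and `checked_div` are made-up names, and the encoding the interoperable ABI eventually picks may well differ.

```rust
// C-compatible tagged result: a u8 tag followed by a union of the payloads.
// `FfiResult` is a hypothetical name; the real proposal may encode this differently.
#[repr(C, u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum FfiResult {
    Ok(u32),
    Err(i32),
}

impl From<Result<u32, i32>> for FfiResult {
    fn from(r: Result<u32, i32>) -> Self {
        match r {
            Ok(v) => FfiResult::Ok(v),
            Err(e) => FfiResult::Err(e),
        }
    }
}

impl From<FfiResult> for Result<u32, i32> {
    fn from(r: FfiResult) -> Self {
        match r {
            FfiResult::Ok(v) => Ok(v),
            FfiResult::Err(e) => Err(e),
        }
    }
}

// A function with a C calling convention that still returns a typed result.
extern "C" fn checked_div(a: u32, b: u32) -> FfiResult {
    if b == 0 {
        FfiResult::Err(-1)
    } else {
        FfiResult::Ok(a / b)
    }
}

fn main() {
    let ok: Result<u32, i32> = checked_div(10, 2).into();
    let err: Result<u32, i32> = checked_div(1, 0).into();
    println!("{:?} {:?}", ok, err);
}
```

A C caller would see this as a struct holding a `uint8_t` tag and a union, which is exactly the kind of shape most languages with an FFI can already describe.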
> C++ has standardized ABIs but you just don't see people dynamically loading a C++ shared object, obtaining a pointer to a C++ object and calling its member functions. They always do it through the C ABI if they do it at all.
That does not match my experience at all. Off the top of my head, Qt (QtPlugin) does C++ ABI plugins, and they do that almost transparently on a myriad of platforms.
> I just get this incredibly hopeless feeling whenever I try to contemplate how I'd handle a C++ exception from Python code.
At one point I could even catch C++ exceptions from Java, and vice versa (see gcj).
As far as I'm aware, the Qt DLLs themselves just expose C++ class APIs directly without going through a C API wrapper, that's why there's a separate Qt SDK for every MSVC version (and all hell breaks loose if you accidentally mismatch DLLs).
Yes, Qt itself is a C++ library. AshamedCaptain was talking about the plugin interface though which is just dynamically loadable modules for Qt applications. I looked it up and those things export C functions that return the C++ objects and meta-object system data.
Only in the sense that C++ itself "uses the C ABI". This is still C++ ABI; most if not all of the plugin types inherit from C++ classes and there are no code wrappers generated whatsoever (except the usual moc slot unmarshallers if you want to be really picky, but these are not going to be ones you'll be calling). These macros are there for discovery and enumeration/reflection. After you have constructed the root plugin object, you are calling it through the C++ ABI, vtables included (since actually you call it through the base class).
> I'm interested in the details of that. Perhaps the GCJ compiler generated some sort of bridge between the languages?
The C++ exception/"personality" ABI is kind of designed for this, which is one of the reasons it may appear complex and/or slow. Just search for gcj exceptions for examples.
> Only in the sense that C++ itself "uses the C ABI"
Not at all. Qt literally uses the C ABI as a protocol for plugins. I have no doubt they chose it because it's the only interface simple and stable enough for the task. It is relatively simple and stable precisely because there are zero C++ features involved. I thought that alone was damning enough so I didn't elaborate further.
What you said is absolutely true: the objects returned by those C functions are accessed by the real C++ ABI with vtables and everything. That's a massive problem and they shouldn't have been designed this way. Look at the mess this causes:
> I know that C++ don't have Aplication Binary Interface - ABI compatibility between compilers (example: MSVC, MingW, Clang and ICC).
> Besides that, the C++ ABI compatibility can be broken even using the same compiler with different version (example: MSVC 2003 and MSVC 2019).
> Example: If the main C++ application was compiled using MSVC 2003, all plugin developers MUST HAVE and MUST USE the exactly same compiler to build your projects, in this case MSVC 2003. Which is not good.
> Looking in the internet I found a way to fix that: Creating a C Wrapper of my C++ Interface Abstract Class.
The solution is, of course, to get rid of all this C++ business and restrict yourself to the C ABI. This is apparently the fate of every ABI that isn't C. When even that is annoying enough to deal with, nobody is ever going to want to deal with anything more complex.
I don't understand this point. You say "Qt uses the C ABI for plugins", then immediately claim that "because they use the real C++ ABI with vtables and everything" it is a "massive problem" that shouldn't have been designed this way.
First, Qt is using the C++ ABI, even for dynamically loaded plugins. This much should be obvious: it is calling C++ methods through a C++ abstract base class, passing C++ objects as arguments, on which C++ methods are going to be invoked. How is this not the C++ ABI?
The one thing that may be confusing is that one uses C APIs (dl, etc.) for discovery, but this is because C++ _is_ using the C ABI. C++ symbols are in the platform executable format's symbol table, after all. How is one going to export and GetProcAddress a plugin object factory method in C++?
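For readers who don't write C++, the shape being argued about (one plain factory symbol for discovery, then every call going through a table of function pointers) can be sketched in Rust as below. This is only an analogue of the pattern, not how Qt implements it, and every name in it (`Plugin`, `PluginVTable`, `create_plugin`, `host_call`) is hypothetical.

```rust
use std::os::raw::c_int;

// Table of C-ABI function pointers, playing the role of a C++ vtable.
#[repr(C)]
struct PluginVTable {
    answer: extern "C" fn(this: *const Plugin) -> c_int,
    destroy: extern "C" fn(this: *mut Plugin),
}

// Object header: the vtable pointer comes first, followed by the object's state.
#[repr(C)]
struct Plugin {
    vtable: *const PluginVTable,
    seed: c_int,
}

extern "C" fn plugin_answer(this: *const Plugin) -> c_int {
    unsafe { (*this).seed * 2 }
}

extern "C" fn plugin_destroy(this: *mut Plugin) {
    unsafe { drop(Box::from_raw(this)) };
}

static VTABLE: PluginVTable = PluginVTable {
    answer: plugin_answer,
    destroy: plugin_destroy,
};

// The factory is the only function a host would look up by name. A real
// plugin would export it unmangled so dlsym/GetProcAddress can find it.
extern "C" fn create_plugin(seed: c_int) -> *mut Plugin {
    Box::into_raw(Box::new(Plugin { vtable: &VTABLE, seed }))
}

// Host side: after discovery, every call is dispatched through the vtable.
fn host_call(seed: c_int) -> c_int {
    let plugin = create_plugin(seed);
    unsafe {
        let vtable = &*(*plugin).vtable;
        let result = (vtable.answer)(plugin);
        (vtable.destroy)(plugin);
        result
    }
}

fn main() {
    println!("plugin answered {}", host_call(21));
}
```

Here the vtable layout is spelled out in `#[repr(C)]` terms, so both sides agree on it by construction. In the C++ case the equivalent layout is chosen by the compiler, which is where the "same compiler, same version" requirement comes from.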
Second, it is not really a "massive problem": it simply works, and it is used by Qt itself for critical functionality such as the image format loaders (which I think people would rather notice if they were not working).
The usual MSVC++ stuff you mention applies for _all_ Windows based development (even C! you are not supposed to mix MSVCRT versions in Windows!), and specifically applies for _linking to Qt itself_. Why would it now become an issue for plugins, which obviously must all link to a compatible version of Qt? Ensuring one uses the same version of the MSVC compiler is not so different as ensuring the use of the same version of MSVC that was used to build Qt, or even to use the same version of Qt for the base and gui libraries...
There are really not that many more issues with the C++ ABI than with the C ABI; they are more visible because people tend to use and expose more of the C++ ABI. E.g. it is rather unlikely to pass FILE objects to libraries, while it is rather common to pass std::string objects. If you send one to a module with a different ABI, you are going to crash (if you are lucky enough).
> How is one going to export and GetProcAddress a plugin object factory method in C++?
How, indeed. That's exactly my point. How could anyone possibly do that? You'd need to implement a C++ compiler frontend in your GetProcAddress and dlsym functions just to resolve C++ symbols. This is why they had to use the C ABI for what you call "discovery". The fact is even getting a simple pointer to a C++ object would be pretty much impossible without a C ABI in there.
To say nothing of actually calling the functions returned with C++ calling conventions. The only reason this QtPlugin business even works is you're also writing C++ code and can therefore use C++ calling conventions. Even that is a dance because those conventions are different for every compiler and even individual versions of the same compiler.
> Ensuring one uses the same version of the MSVC compiler is not so different as ensuring the use of the same version of MSVC that was used to build Qt, or even to use the same version of Qt for the base and gui libraries...
So your solution is to just make sure everyone is using the same compiler? Yeah, easier said than done. Even in the free software ecosystem which hates binary interfaces and has the capacity to rebuild the world if necessary, breaking ABIs cause chaos and pain for everyone.
> How, indeed. That's exactly my point. How could anyone possibly do that? You'd need to implement a C++ compiler frontend in your GetProcAddress and dlsym functions just to resolve C++ symbols
I still don't follow. C++ symbols are C symbols. Can you give an example of how you can get more C++ than Qt is? If you mean mangling, I don't think you need an entire C++ compiler for it. But even standard C++ contains some escape hatches to avoid mangling (one of them being 'extern "C"').
> The only reason this QtPlugin business even works is you're also writing C++ code and can therefore use C++ calling conventions.
I don't disagree with that. My point was that dynamically loadable plugins in C++ -- with "vtables and all" -- not only work, but are in fact frequently used.
> Even that is a dance because those conventions are different for every compiler and even individual versions of the same compiler.
> So your solution is to just make sure everyone is using the same compiler?
This is an entirely different problem that affects _all_ languages. On Windows, as I mentioned, there are even multiple C ABIs (incl multiple calling conventions) and standard libraries and everything you can think of. This is done intentionally, so that they can more easily preserve binary compatibility with older versions. Other platforms have made different compatibility decisions.
I think people significantly exaggerate how much "pain" the situation causes (and this comes from someone who has distributed commercial multi-platform C++ software for 20+ years). In any case this is a common problem, dynamic plugins or not, C++ or C. And definitely, this is not something so annoying I would trade dynamic linking for. But that's off topic.
> but you just don't see people dynamically loading a C++ shared object
...that's exactly how Autodesk Maya plugin DLLs work, and it's a PITA, because DLLs need to be recompiled with a specific MSVC version for each new Maya version. Qt DLLs on Windows are also separate for each MSVC version.
> C++ has standardized ABIs but you just don't see people dynamically loading a C++ shared object, obtaining a pointer to a C++ object and calling its member functions
... Sorry what? This is exactly how every C++ plugin system I've worked with works. There are even some industry standards that work like this, e.g. VST3: https://steinbergmedia.github.io/vst3_dev_portal/pages/Techn... ; there are dozens of companies that provide hosts for such plugins and thousands that provide the actual plugins, all built with different toolchains, compilers etc., so obviously it works.
You can absolutely link to all of those things if you have a compiler toolchain to do it. Can you dynamically load Qt's C++ DLLs from an arbitrary scripting language and use them via foreign function interfaces though? I don't think so. I remember exploring even Boost Python source code and finding extern "C" deep in its bowels so I seriously doubt people are doing this.
> Being 100% compatible with C++ means more or less adding a fully functional C++ compiler front end to D.
> making a D compiler with such capability unimplementable
> the solutions have been:
> Support the COM interface (but that only works for Windows).
> Laboriously construct a C wrapper around the C++ code.
> Use an automated tool such as SWIG to construct a C wrapper.
> Reimplement the C++ code in the other language.
> Give up.
The pragmatic approach mentioned apparently consists of matching the D and C++ ABIs as much as possible just to make the problem tractable. Even after that impressive effort, the foreign interface is basic and incomplete. Special member functions like constructors and destructors are not supported, for instance. Notably, there is no exceptions support!
> Swift is in the process of getting one
Swift had to embed a literal C++ compiler inside itself in order to even make such a proposition viable.
ABI stability between C++ libraries and applications built with the same compiler, sure. In other words, the only code that touches C++ code is other C++ code.
I'd like to emphasize that these tests were simple function calls between C code generated by C compilers. I can only imagine the hell that might be unleashed if similar tests for C++ ABIs are created.
> clang and gcc can’t even agree on the ABI of __int128 on x64 linux. That type is a gcc extension but it’s also explicitly defined and specified by the AMD64 SysV ABI in one of those nice human-readable PDFs!
Yikes! So it is supposedly defined in the SysV ABI, and at least one of the compilers fails to implement that. Are there bug reports for this? How does the disagreement manifest? Different calling convention when used as a function parameter or return value? Different alignment? I doubt that it would have different size or object representation.
> The interoperable ABI does not aim to support the full richness of Rust's type system, or that of other languages. It aims to support common cases more safely and simply.
So not much (or any) more than it currently is: it's not a stable Rust ABI, it's instead a more expressive ABI compared to C.
That said, even though it won't be "a stable Rust ABI", it will (if the experiment succeeds) be a stable ABI that is more Rust-like than the current C ABI.
As far as I understand the proposal, it's "just" standardizing how some high level Rust types would be automatically translated into a "C ABI" compatible representation, sort of like a builtin bindings generator.
But I wonder why DLL linkage wouldn't already be possible without this proposal.
C++ doesn't have a standardized ABI either, but it's possible to load shared libraries with C++ interfaces just fine - as long as they've been created with the same compiler and compiler version. Since there's only a single Rust compiler vendor I'd imagine that dynamic linkage should already be much less trouble than in C++ land?
Rust moves much faster than C++, at least at the moment. So the practical benefit of sharing libraries created with the same compiler version would be much smaller.
Yes. Note that they are intending to build it on top of the C ABI that already exists. So anything you can do with the new ABI, you can already do today. It just currently takes a lot of work. Every &[u8] slice you want to pass has to be rewritten as two arguments for pointer and length. That’s the easy stuff; error handling is as painful as it always is in C. You have to sit down and essentially bang out a C header file as Rust extern “C” functions, using error codes, fancy ways to represent error conditions as invalid return values (eg negative numbers) and all those tricks. Even if your target language is eg Swift and Swift can absolutely be taught to understand slice types, result types, nullable types etc, since it has all of these things natively. And then you have to write an actual C header file or use cbindgen, and then do the whole thing in reverse to climb back up to the high level types in the target language. As I said: a lot of work.
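To make the "lot of work" concrete, here is a small sketch of the manual lowering described above: a function that would naturally take `&[u8]` and return a `Result` gets rewritten with a pointer/length pair, an out-parameter and negative error codes, and the caller then has to rebuild the high-level types by hand. All names and the specific error codes are invented for the example.

```rust
// Status codes in the usual C style: 0 for success, negative for errors.
// These particular values are assumptions made for this sketch.
const OK: i32 = 0;
const ERR_NULL: i32 = -1;
const ERR_EMPTY: i32 = -2;

// What has to be written by hand today: the slice is split into pointer and
// length, the success value goes through an out-parameter, and the error
// becomes an integer code.
extern "C" fn max_byte_c(ptr: *const u8, len: usize, out: *mut u8) -> i32 {
    if ptr.is_null() || out.is_null() {
        return ERR_NULL;
    }
    let data = unsafe { std::slice::from_raw_parts(ptr, len) };
    match data.iter().copied().max() {
        Some(max) => {
            unsafe { *out = max };
            OK
        }
        None => ERR_EMPTY,
    }
}

// The caller then climbs back up from the C-level signature to a Result.
fn max_byte_via_ffi(data: &[u8]) -> Result<u8, i32> {
    let mut out = 0u8;
    match max_byte_c(data.as_ptr(), data.len(), &mut out) {
        OK => Ok(out),
        code => Err(code),
    }
}

fn main() {
    println!("{:?}", max_byte_via_ffi(&[3, 9, 4]));
    println!("{:?}", max_byte_via_ffi(&[]));
}
```

Both halves of this round trip are pure boilerplate, and as I read the proposal, that is the part a shared set of type mappings is meant to remove.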
The general idea is for someone to eventually teach Swift about the slice type etc., so that Swift defines the same mappings from its high-level types (Optional etc.) to a C ABI representation as Rust does. Once that work is done, you don’t need to drop down to banging out C headers and writing the repetitive conversion code on either end. All that would remain is keeping the “extern” declarations/imports in sync with the exports, which will be much easier once higher-level types can be used easily in FFIs. Making an IDL to describe them is out of scope for now but clearly a possibility.
> You have to sit down and essentially bang out a C header file as Rust extern “C” functions, using error codes, fancy ways to represent error conditions as invalid return values (eg negative numbers) and all those tricks.
You have to do these things anyway if you want a Rust library to be accessible via C FFI, which is the preferred extern interface wrt. most programming languages. (Swift is an exception, since it was specifically designed to have a stable extern ABI).
It primarily helps making dynamically loadable Rust libraries that Rust can load. Everything else is downstream from there. So for instance if you have a web server written in Rust and you want to dynamically load a plugin also written in Rust.
No, COM is a ref-counted land of objects with virtual tables of functions. This proposal seems to define how to convert C ABI to/from Rust types in a standard way. This will likely improve Rust+C use cases over time.