It's important for core infrastructure to have multiple competing implementations. On a related note, does Rust have a standard yet or are they still doing the reference implementation thing?
> I was under the impression that llvm is better than gcc?
And I thought that tabs were better than spaces, BSD beat Linux, Emacs was the one true god... what were we arguing about again?
My impression is that gcc produces smaller code that is roughly comparable with clang in terms of runtime performance (with a slight advantage when compiling the codebase I care about the most). Gcc has better debug info, especially when optimizing. I don't know about compile speeds. Clang has better infrastructure for writing static analysis tools. Clang is a much more realistic alternative to msvc on windows than gcc is. I don't know about their development velocities. Clang seems to have the edge in mindshare.
I'd be curious to know whether this would provide cross-language inlining during LTO when using gcc. I believe some form of this is possible with llvm?
My understanding is that LTO (and thus any cross-language inlining) takes place on a low-level IR where language barriers aren't relevant. The GCC and LLVM backends both have full support as far as I know. In principle it's simple to implement support in a given frontend, but apparently it proved to be a bit tricky in practice for Rust (http://blog.llvm.org/2019/09/closing-gap-cross-language-lto-...).
I don't think Rust is even defined by a reference implementation, given that they release a new compiler every six weeks.
For many practical purposes I think the closest thing to a language definition is the set of testsuites visible to Crater.
(That is: when the compiler people are considering a change, they don't say "we can't change this because we're past 1.0 and the change is technically backwards-incompatible", or "we can't change this because the Reference specifies the current behaviour"; they say "let's do a Crater run and see if anything breaks".)
This is not correct. We do use crater to help with questionable cases, but we often say “we can’t change this because we’re past 1.0 and the change is backwards incompatible.”
Which is still not the same as having a spec or even a reference implementation.
It’s a bit weird how laser-focused the Rust community is on backwards compatibility, not seeming to believe that forward compatibility is also important.
e.g., if I write code targeting C++17, I can be reasonably sure it compiles with an older version of the compiler, as long as that version also claims to support C++17, modulo bugs. Not the case if I write code targeting Rust 2015 as they’re still adding features to that from time to time. Let alone Rust 2018 which changes every 6 weeks.
Will there ever be a version of Rust that the community agrees “OK, this language definition is not changing unless we find some major soundness bug” ?
This is a big blocker for mainstream adoption in Linux distributions, since maintainers want to be able to land one specific version of rustc in the repositories rather than rely on people continuously downloading new versions with rustup. But old versions of rustc are effectively useless due to the lack of forward compatibility guarantees.
It's funny you cite C++, which has the best example of forward compatibility breakage in terms of impacting people.
g++ 4.4 implemented several key parts of C++11, including notably rvalue references, and adapted libstdc++ to use rvalue references in C++11 mode. However, the committee had to make major revisions to rvalue references subsequent to this implementation, to the point that you can't use libstdc++ 4.4's header files (such as <vector>) with a compliant C++11 compiler. So when you try to use newer clang (which prefers to use system libstdc++ for ABI reasons) on systems with 4.4 installed (because conservative IT departments), the result is lots and lots of pain.
Furthermore, it absolutely is the case that newer versions of compilers will interpret old language standards differently than older versions of the compiler. You don't notice it for the most part because the changes tend to revolve around pretty obscure language wordings involving combinations of features that most people won't hit. Compilers are going to try hard not to care about language versions past the frontend of the compiler--if the specification change lies entirely in the middle or backend, then that change is likely to be retroactively applied to past versions because otherwise the plumbing necessary is too difficult.
g++ 4.4 came out before the C++11 standard was released. So I’m not sure how it’s a counterexample. C++11 was a standard under active development and so obviously you can expect changes; its status at that time was comparable to rust nightly.
It's not that the Rust community ignores forward compatibility, it's just that right now is not the right time. The things recently landing in Rust are not new; most of them were designed in, like, 2017. They just took years to stabilize.
When is the right time, then? I would have thought the move to the 2018 Edition should have been the perfect time to declare the 2015 Edition as stable and unchanging, no? But it is still receiving major changes like non-lexical lifetimes.
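(For concreteness, here's a minimal sketch of the kind of code whose acceptance changed: the old lexical borrow checker rejected this, while non-lexical lifetimes accept it because the borrow ends at its last use.)

    fn main() {
        let mut v = vec![1, 2, 3];
        let first = &v[0];              // shared borrow of v
        println!("first = {}", first);  // last use of the borrow
        v.push(4);                      // OK under NLL; the old lexical
                                        // borrow checker rejected this
    }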
Editions are not designed to be unchanging snapshots of the Rust compiler at a specific moment in time. By design, all Rust editions share the same middle-end and backend, and only differ in the frontiest parts of the frontend. The idea is that people should be able to update their version of the compiler without being required to update code that compiled prior to the 2018 edition.
Yes, I understand that. I'm saying I don't understand why the Rust community has made that choice, since it essentially makes old versions of the compiler useless and therefore makes it difficult for software written in Rust to be part of a typical Linux distribution.
(And has a number of other disadvantages too, like constant cognitive load having to re-learn the language every 6 weeks).
Also, editions could be a snapshot of the language definition at a point in time, without being a snapshot of the compiler. There are still new versions of Clang and GCC coming out with new bugfixes, better optimizations, improved error messages, support for different hardware, WIP support for future language editions, etc., without changing the C++17 standard.
> I don't understand why the Rust community has made that choice, since it essentially makes old versions of the compiler useless
I mention the reason in my prior comment: to allow people to continually upgrade their compiler version without needing to change any code. Rust doesn't have a stable ABI, so all crates in a Rust project ultimately need to be built with the same compiler (and furthermore, crates must always be able to interoperate regardless of which edition they're on). That means that every new version of the compiler needs to support every old edition, because the alternative is to have users stuck on old versions of not just the compiler but also on old versions of dependencies that have since begun using features only supported by newer compilers. In Rust's case avoiding such a fundamental fracture in the community was more important, since, after all, there's still nothing stopping anyone from voluntarily sticking with an older version of the compiler if they're willing to deliberately endure such a situation.
> (And has a number of other disadvantages too, like constant cognitive load having to re-learn the language every 6 weeks).
This is quite hyperbolic. Rust introduces no more features than any other language, it simply rolls them out on a more fine-grained schedule. Furthermore, Rust hardly requires re-learning every six weeks; a "feature" introduced by a new version is often nothing more than a new convenience method in the standard library. The fact that we have established that Rust goes out of its way even to keep "old" code compiling and compatible with the rest of the ecosystem should demonstrate how little it demands that users re-learn anything.
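(To give a hedged, concrete example of how small such a "feature" can be: one of the six-week releases after 1.0 simply stabilized a standard-library helper along the lines of Option::filter, which lets you write the following.)

    fn main() {
        // keep the value only if it satisfies the predicate, otherwise None
        let even = Some(4).filter(|&n| n % 2 == 0);
        let odd = Some(3).filter(|&n| n % 2 == 0);
        println!("{:?} {:?}", even, odd); // prints: Some(4) None
    }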
How come, given that C++ currently gets a new standard only every three years, and then it takes about as long again before the latest standard is supported across all major compilers?
Right now with C++20 around the corner, C++14 is still the safest bet for portable code, whereas in Rust we still see relevant crates that depend on nightly.
Here is the list of all the major features added to Rust since 1.0 (a short sketch of a couple of them follows the list):
* The new lifetime model (non-lexical lifetimes)
* Async fn
* Procedural macros
* ? operator
* Import name resolution changes
* impl Trait
* C-style unions
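For anyone who hasn't followed these, here is a minimal sketch of two of them, the ? operator and impl Trait (the function names are invented for illustration):

    use std::fs;
    use std::io;

    // impl Trait in return position: callers see "some Iterator"
    // without the concrete iterator type being named.
    fn line_lengths(text: &str) -> impl Iterator<Item = usize> + '_ {
        text.lines().map(|l| l.len())
    }

    // The ? operator propagates errors instead of nesting matches.
    fn read_config(path: &str) -> io::Result<String> {
        let contents = fs::read_to_string(path)?;
        Ok(contents)
    }

    fn main() -> io::Result<()> {
        let total: usize = line_lengths("a\nbb\nccc").sum();
        println!("total line length = {}", total);
        let cfg = read_config("Cargo.toml")?;
        println!("read {} bytes", cfg.len());
        Ok(())
    }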
Here's the list of features added in C++17 and C++20:
* constexpr if
* Modules
* Structured binding
* Type deduction helpers
* Coroutines
* <=> operator
* Concepts
* Expanded the set of expressions and statements that qualify as constexpr to the point that it's a very different feature from what it was in C++11.
In the same amount of time, C++ has added roughly the same number of features, but I would qualitatively say that C++'s feature additions are more impactful than Rust's feature additions, especially in terms of making newer code unrecognizable to programmers used only to the old version.
That's what I meant by pace versus cadence--overall, C++ has changed more, but it tends to change in triennial bursts instead of every six weeks.
I write standard-compliant C++ code that only works with the latest compiler versions, because older ones incorrectly claim support for a given C++ standard: their implementation is too buggy.
Which is the main reason no usable alternative implementations exist yet, and why Rust hasn't yet found its way into more low-level software projects like the Linux plumbing world.
Rust dearly needs a stable specification, it is the main blocker why the language hasn't been more widely adopted.
I agree. That, and there's no need to rush new features; stabilize old ones and fix their bugs instead. I am still waiting for RFC0066 to be fixed. It is from 2014-05-04! Here is its GitHub issue: https://github.com/rust-lang/rust/issues/15023. Backstory: I started writing a relatively simple Rust program many years ago when I ran into this issue. It was my first attempt at writing Rust, and I did not like the workaround.
Rust has LTS-like releases based on years, but the last one didn't have async/await, so everyone just kept tracking. I think the next LTS-type release may snag some people and slow stuff down. Cargo should handle that fine, but I'm not sure about maintainers. I can't get cargo in my day job, just the rustc 1.3x shipped with RHEL, so I'm already standing still. RHEL's version might make a good de facto LTS in the absence of cargo, but the thin std lib makes that hard, and random old rustc versions sometimes don't play nice with every crate.
Rust doesn't have LTS releases. Editions are both a way to market the new features introduced in the past years as a "bundle" to outside users, and a way for us to introduce some breaking changes for the users opting into the new edition. The release an edition is stabilized in (1.31 for Rust 2018) does not get any special treatment from the Rust team though, and that includes no LTS support (for example we won't backport bug fixes, even security ones).
Yikes! I had no idea. It sometimes takes my company years to approve a new point release of software. A RHEL subscription was my backdoor to get Rust at work. I knew I'd never get cargo access or a mirror approved, but I thought some day I'd push for an LTS release. I don't see my company ever doing software approval and transfer to our development network faster than once a year.
As I pointed out in another comment, the definition of the 2015 edition is still changing (i.e., features from 2018 are getting backported to 2015), severely limiting the usefulness of the "edition" concept.
E.g., if someone thinks "I'm going to target the 2015 edition because I want my code to run on the rustc shipped with various slow-moving Linux distros", it doesn't help, because you might still not be able to build their code unless they specifically target an older version of rustc, which nobody does.
Editions solve an entirely separate problem, they were never meant to be LTS language snapshots. For example, C++ is considering adopting them in addition to their current versioning scheme: http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p1881r0....
There has been discussion of a Rust LTS channel alongside stable/beta/nightly, which would try to solve that problem, but it has not been prioritized yet: https://github.com/rust-lang/rfcs/pull/2483
An actual frozen language is also a possibility, but probably won't happen until more work happens on an independent specification. Which, in fact, people are also working on: https://ferrous-systems.com/blog/sealed-rust-the-pitch/
I would say that Rust doesn't have a formal specification, but what it does have, the "Rust Reference", is close to or better than many languages' "specs": https://doc.rust-lang.org/stable/reference/
It really depends on how strictly you define the term specification. The Rust Reference is not required to be accurate. Though many other language compilers/implementations don’t fully implement their respective specs so, :shrug:.
The Rust Reference is very far from being complete, or even correct in what it does cover.
If I have a question whose answer isn't obvious, it's far more likely that I have to go trawling around in RFCs than that there's an answer in the reference.
I think most languages of a similar age (eg Go, Swift) are doing better.
Rust is younger than Go (released in 2015 vs 2012) and way more ambitious. In particular, Rust 1.0 was released as kind of an MVP and many things have changed since then, which has made maintaining such a reference an issue. The pace of change is slowing nowadays (that's especially visible if you look at new and accepted RFCs), so I hope the reference will catch up eventually.
There are some people studying Rust with formal verification, for example in this paper: https://plv.mpi-sws.org/rustbelt/rbrlx/paper.pdf However, I do not know whether the whole language is covered or only a core subset.
Llvm is better than gcc at modularity and extensibility (or at least it was when llvm was released, I haven't followed gcc evolutions in a while). People who work on new languages typically use llvm because it's designed to make such things simple.
Now, in terms of end results, llvm and gcc each have their qualities. When llvm was released, gcc typically produced faster binaries but llvm optimizations were easier to understand. Since then, both have evolved and I haven't tried to catch up.
Bottom line: having two back-ends for rust or any other language is good. Among other things, it's a good way to check for bugs within an implementation of the compiler, it can be used to increase the chances of finding subtle bugs in safety-critical code, etc.
One thing GCC excels at over LLVM is the quality of debug information. If you switch from Clang to GCC, you will see less "optimized out" in GDB. This is pretty much guaranteed.
And there are plenty that support the converse ;) I don't think there's anything that points to one being definitively better than the other in performance.
GCC does some things better than LLVM. It supports more architectures, has a non-broken implementation of restrict (which should be useful for Rust), and optimizes some code better. They both have their own pros and cons.
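(To illustrate why restrict matters for Rust, a rough sketch, not taken from either compiler: because two &mut references are guaranteed not to alias, rustc can mark them noalias/restrict, and the backend may then keep values in registers instead of reloading them.)

    // Rough sketch: `a` and `b` are &mut, so they cannot alias.
    // With noalias/restrict information the optimizer may keep *a in a
    // register across the write to *b instead of reloading it from memory.
    pub fn bump_both(a: &mut i32, b: &mut i32) -> i32 {
        *a += 1;
        *b += 1;
        *a += 1; // no reload of *a needed if the backend trusts noalias
        *a + *b
    }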
I'm actually surprised that Rust enabled noalias usage with this known outstanding issue. When I worked on Rust years ago, it was definitely common knowledge on the compiler "team" that this was broken.
I'm equally surprised that GCC had that bug, since their pointer aliasing model is equipped to correctly handle this situation (and is why they were able to fix it quickly).
Eh, it’s not as one-sided as that. GCC has a larger number of targets, but LLVM supports several newer targets that GCC doesn’t, like WebAssembly and eBPF (although the latter is coming in GCC 10). But it would certainly be nice for Rust to support both sets of targets.
In theory, with both GCC and LLVM, a front-end (in this case Rust) compiles the source down to an intermediate representation (IR). There will likely be some differences between the IR emitted by different front-ends, but after successive optimisations have been applied these will largely disappear. By the time you get to generating assembly, you can't really tell the difference any more, so the semantics of the original language don't make an impact.
I'm sure there are a number of "reasonable" assumptions that aren't true–probably things like the number of bits in a byte, or the size of a particular integral type, or support for a particular platform behavior.
> The C standard uses the term byte to mean the minimum addressable unit in the implementation, which is char, which means a byte on these targets is 16 bits. This is in conflict with the widespread use of byte to mean 8 bits exactly. This is an unfortunate disagreement between C terminology and widespread industry terminology that TI can't do anything about.
Absolutely not. A byte is the smallest block of memory with an address. E.g you can't take the address of 7 combined bits on x86 but you can for 8.
In the past, architectures differed wildly in the number of bits per byte, e.g. 36 for the machine where the Pascal language was created.
Today, the industry has mostly standardized on 8 bits per byte, but see e.g. the PIC architecture for an example relevant today with a different choice: 8-bit bytes for data, but 10-bit bytes for instructions.
> A byte is the smallest block of memory with an address. E.g you can't take the address of 7 combined bits on x86 but you can for 8.
I think that's an anachronistic/incorrect usage. A lot of machines (including several with 36-bit words that you mentioned) supported larger basic addressable units of memory, but didn't call these larger units "bytes", and distinguished between "bytes" and "words". In fact, one of the elements of the early RISC philosophy was that CPU support for byte accesses (as opposed to word accesses) was extraneous, based on statistics gathered from real programs. Early MIPS/Alpha/etc. machines did not support byte addressing, but the people using them still called 8 bits a byte.
Arguably the first Alphas could have had a C compiler with 64 bit bytes but that would have made porting hard. Even then they were forced to add byte operations pretty early on.
"Byte" is also often defined as the smallest addressable unit in a computer, which nowadays is most commonly 8 bits, to the point where you can generally assume it. But this was different in the past (6 and 9 bits being especially common alternatives) and still differs in some niches like DSPs, which sometimes can only work on wider types. At least those are then typically powers of two, which makes things easier for many tools.
> I've been under the impression that GCC still has much better hardware optimizations than LLVM has.
That is my experience too.
For code with a high level of nesting, meaning high potential for inlining (typically C++), GCC is close to unbeatable, even compared to highly optimised compilers like Intel's ICC.
GCC has a reputation for having a confusing architecture. It is a very hard project to work on. LLVM is typically considered cleaner and more understandable. GCC is still known, in 2019, to have a rather slight performance benefit.
LLVM also has a stable IR (LLVM IR itself), while GCC refuses to do so over the decades for political and strategic reasons.
> while GCC refuses to do so over the decades for political and strategic reasons.
That was a long time ago. Since GCC 4.5 (released in 2010), GCC has supported external plugins. [3,4] These plugins, like the rest of GCC, use GENERIC and GIMPLE as their IR.
Having worked with both, I don't know what you mean by "confusing architecture". Both are OK to work with, but both have some glaring holes in their documentation. LLVM's data structures are typically nicer to use than GCC's linked lists in a lot of places, that much is true.