This. I don't think enough people appreciate how simple and clear the JVM bytecode's design is.
Most platforms in use today are so complex. Take a look at x86, or one of the latest ECMAScript specifications. Even LLVM bitcode is a bit complicated compared to the JVM bytecode.
I think in programming language research, going forward, we need some research into "high-level bytecodes". I.e. bytecodes that capture high-level concepts in a clear and simple way.
Those bytecodes turned out that way because the JVM bytecode was designed to be easily interpreted, whereas CIL was designed for JIT compilation. So for example CIL's `add` opcode is missing type information that has to be recovered from the context in which it is used, while the JVM's `iadd` and its typed variations are easier to interpret directly.
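To make the contrast concrete, here's a small illustrative Java sketch (the class and method names are mine, just for the example): javac picks a different typed opcode per operand type, so an interpreter never has to infer what `+` means from context. Where CIL emits one polymorphic `add`, the JVM emits `iadd` for ints and `ladd` for longs.

```java
public class TypedOpcodes {
    // javap -c shows this compiles to: iload_0, iload_1, iadd, ireturn
    static int add(int a, int b) { return a + b; }

    // ...and this one to: lload_0, lload_2, ladd, lreturn
    // (the opcode itself encodes the operand type)
    static long add(long a, long b) { return a + b; }

    public static void main(String[] args) {
        System.out.println(add(2, 3) + " " + add(2L, 3L));
    }
}
```

Running `javap -c TypedOpcodes` on the compiled class shows the two distinct opcodes; a CIL disassembly of the equivalent C# would show the same `add` instruction in both methods.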
You can see this design choice even today in how the JVM and CLR work. The JVM starts execution in an interpreter mode, then gradually compiles pieces of code as it detects bottlenecks. So the compilation that happens at runtime is very gradual and based on runtime measurements.
The CLR, on the other hand, has always JIT-compiled code up front, with the ability to cache the compiled code for faster startup (e.g. Ngen). So it has been oriented towards ahead-of-time compilation.
Sure. When I posted that yesterday, I tried to find information on whether and how the `add` opcode handles differing types. Now that I'm not on my phone, I looked up the ECMA spec, and it looks like CIL still only allows like types, with some minor exceptions involving `native int`. So it's up to the compiler to make sure that actually happens and to extend types as necessary.
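For what it's worth, javac enforces the same "like types only" invariant on the JVM side by inserting explicit widening conversions. A minimal sketch (class and method names are just for illustration):

```java
public class Widening {
    // Mixed-type addition is resolved at compile time: javac widens the
    // int operand with an `i2l` instruction before emitting `ladd`, so
    // the bytecode itself only ever adds like types -- the same invariant
    // the CIL spec pushes onto its compilers.
    static long widen(int i, long l) {
        return i + l;   // bytecode: iload_0, i2l, lload_1, ladd, lreturn
    }

    public static void main(String[] args) {
        System.out.println(widen(7, 100L));
    }
}
```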
I too prefer CIL, and the C# language over Java (though I really like Java and its ecosystem), but we have to admit that MS had 5-10 years to learn from Java's design decisions and their effects, and still did not manage to overcome every problematic point :)
Although it should be mentioned that java language semantics are largely (depending on how you measure them :-) absent from the jvm. (Default methods were a very unusual change in that respect)
And, as you say, subsequent history has weirdly inverted the JVM and CIL. The former is a lot less 'J' and the latter is a lot less 'C' ;-)
I would bet the proportion of .NET users running VB.NET is far higher than the proportion of JVM users running Scala/Clojure/Kotlin. They just don't tend to be the sort to post to Hacker News.
I think the bane of James Gosling's existence is being incorrectly associated with Java-the-lacklustre-language instead of correctly being associated with the superb JVM.
If there is a problem with Java bytecode, it's that it is hard to verify. You need multiple passes over the bytecode until you've reached a steady state.
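To illustrate why verification is iterative, here's a toy sketch (my own construction, not the real JVM verifier, which merges full type states rather than just depths): the stack depth at a jump target is only known once every path reaching it has been analyzed, so you propagate facts with a worklist until a fixed point is reached, and reject the code if two paths disagree.

```java
import java.util.*;

public class ToyVerifier {
    // A toy instruction: its net effect on stack depth, plus an
    // optional jump target (negative = falls through only).
    record Insn(int stackDelta, int jumpTarget) {}

    // Returns the stack depth at entry to each instruction, or throws
    // if two paths reach the same instruction with different depths.
    static int[] verify(Insn[] code) {
        int[] depthAt = new int[code.length];
        Arrays.fill(depthAt, -1);            // -1 = not yet reached
        Deque<Integer> work = new ArrayDeque<>();
        depthAt[0] = 0;
        work.push(0);
        while (!work.isEmpty()) {            // iterate to a fixed point
            int pc = work.pop();
            int out = depthAt[pc] + code[pc].stackDelta();
            for (int succ : successors(code, pc)) {
                if (depthAt[succ] == -1) {   // first time reached
                    depthAt[succ] = out;
                    work.push(succ);
                } else if (depthAt[succ] != out) {
                    throw new IllegalStateException("stack mismatch at " + succ);
                }
            }
        }
        return depthAt;
    }

    static List<Integer> successors(Insn[] code, int pc) {
        List<Integer> s = new ArrayList<>();
        if (pc + 1 < code.length) s.add(pc + 1);   // fall through
        if (code[pc].jumpTarget() >= 0) s.add(code[pc].jumpTarget());
        return s;
    }
}
```

The real verifier does the same dance per local-variable slot and stack entry with actual types (and their common supertypes), which is where the multiple passes get expensive; the StackMapTable attribute added in Java 6 exists precisely to let the verifier check the fixed point in one pass instead of computing it.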
There is also the "issue" that Java bytecode allows arbitrary control flow with goto, while the Java language doesn't.
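You can see this with any loop: Java has no `goto` statement, yet javac freely emits the opcode. A small example (exact bytecode offsets will vary; the disassembly in the comment is roughly what `javap -c` shows):

```java
public class LoopGoto {
    // Compiles to roughly (limit in slot 0, i in slot 1):
    //   iconst_0
    //   istore_1          // i = 0
    // loop:
    //   iload_1
    //   iload_0
    //   if_icmpge exit    // i >= limit ?
    //   iinc 1, 1         // i++
    //   goto loop         // the back edge is a plain `goto`
    // exit:
    //   iload_1
    //   ireturn
    static int countTo(int limit) {
        int i = 0;
        while (i < limit) i++;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(countTo(5));
    }
}
```

The verifier doesn't care that the jumps came from a structured loop; any target with a consistent stack/type state is legal, which is exactly the freedom the Java language itself withholds.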
IMHO WebAssembly solves that better, though admittedly it had Java to learn from.
> IMHO WebAssembly solves that better, though admittedly it had Java to learn from.
While that may be true relative to the Java language, WebAssembly's jump instructions are not without their annoyances either. For example, the JVM bytecode requires your whole stack to match precisely when jumping, whereas WebAssembly only cares about the most recent values. If you expect your jump targets to have the same stack layout, WebAssembly makes the implementation handle it. I had to account for this and other differences in my compiler [0].