We have seen many languages cycle in popularity, but Julia is one of the few high-level languages that can actually match... or in some cases exceed... C/C++ performance.
There are always tradeoffs, and it usually takes a few weeks for people to come to terms with why Julia is unique.
This is actually really easy: most C/C++ code is pretty slow. Beating perfectly optimized C/C++ code by a notable margin is basically impossible (all relatively fast languages tend, in the limit, to converge to theoretical peak CPU performance), but real-world code isn't perfectly optimized. The better question is who wins on a performance-vs-effort graph, and Julia has a ton of major advantages here.

The base language gives you fast implementations of common data structures (e.g. dictionaries and BitSets) and BLAS/LAPACK wrappers that do linear algebra efficiently while your code still looks like math. The package manager makes it basically trivial to add packages for more complicated problems (no need to mess around with makefiles). The REPL makes it really easy to interactively tweak your algorithms and gives you easy ways to introspect the compilation process (@code_native and friends).

Another major advantage is that Julia has macros that make it easy to apply local changes to a block of code's semantics that are whole-compilation-unit compiler flags in C/C++. For example, consider `@fastmath`. In C/C++ you can only opt into fast-math at the per-compilation-unit level, so most projects that have even one part requiring IEEE handling of nonfinite numbers, or strict associativity, will globally opt out of the non-IEEE transforms. In Julia, you just write `@fastmath` before a function (or for loop, or single line) and you get the optimization.
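To make the locality point concrete, here is a minimal sketch (the `sum_fast` / `sum_ieee` names are illustrative, not from the original post):

```julia
# Sketch: fast-math is opted into for one loop only, not per file.
# `sum_fast` is a hypothetical helper name for illustration.
function sum_fast(xs)
    s = 0.0
    @fastmath for x in xs
        s += x        # reassociation/vectorization allowed only here
    end
    return s
end

sum_ieee(xs) = sum(xs)  # the rest of the program keeps IEEE semantics
```

The two functions can coexist in the same file, which is exactly what a per-compilation-unit flag like `-ffast-math` can't give you.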
All the other answers are true. But there is one thing I didn't see anyone saying: thanks to the existence of macros (and `eval`), you can generate and compile code at runtime. Combined with Julia's fast compile times, this allows faster solving of some problems that are too dynamic for ahead-of-time code.
This might sound counterintuitive, given that latency is the problem most often mentioned everywhere else about Julia. But, if you think about it, Julia compiled a plotting library from scratch to native code in around 15 seconds every time you imported it (before Julia 1.9, where native caching of code was introduced and latency was cut down significantly).
This means that for problems where you would like to (for example) generate polynomials at runtime and evaluate each of them a billion times, Julia can generate efficient code for every polynomial, compile it, and run it fast for those billion evaluations. In C/C++/Fortran you would have needed to write a (really fast) generic function to evaluate polynomials, but this would always (TM) have been less efficient than code generated and optimized for each specific polynomial.
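As a hedged sketch of that idea (the `make_poly` name and the Horner construction are my own illustration, not from the original post):

```julia
# Build a specialized evaluator for a polynomial whose coefficients
# are only known at runtime. For coeffs = [c0, c1, c2] this builds
# the expression muladd(muladd(c2, x, c1), x, c0) (Horner's rule),
# which Julia compiles to native code on the first call.
function make_poly(coeffs)
    body = :($(last(coeffs)))
    for c in reverse(coeffs[1:end-1])
        body = :(muladd($body, x, $c))
    end
    @eval x -> $body
end

p = make_poly([1.0, 2.0, 3.0])  # 1 + 2x + 3x^2
p(2.0)                           # returns 17.0
```

Each generated closure has the coefficients baked in as constants, so the hot loop evaluating it a billion times pays no indirection for them.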
Edit: fixed typos and added some remarks that were lacking originally
In general, parallelization was a messy kludge in older languages, which were originally intended for single-CPU machines. Additionally, many modern languages inherited the same old library-ecosystem issues via Simplified Wrapper and Interface Generator (SWIG) template code (Julia also offers similar wrapping support).
Only a few ecosystems, like Go's, had developers take the time to refactor many useful core tools into clean, parallelized versions native to the ecosystem. To a lesser extent, Julia devs seem to focus on similar goals, given how inherently easy it is to do this correctly in Julia.
When one compares the simplicity of a broadcast-operator version of some function in Julia with the amount of effort needed to achieve similar results in pure C/C++... the answer to where the efficiency gains arise should be self-evident.
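For instance (a small sketch of my own, with made-up data):

```julia
# One fused broadcast line: a single pass over the data and a single
# output allocation. The equivalent C would be an explicit malloc
# plus a hand-written loop over every element.
xs = collect(1.0:1000.0)
ys = @. sqrt(xs) + sin(xs)^2
```

The `@.` macro fuses all the element-wise operations into one loop, so no intermediate arrays are allocated.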
One could always embed a Julia program inside a C wrapper if it makes you happier. =)
I do not dispute that C is not the fastest language. However, C99 has the `restrict` keyword, which, when combined with the strict aliasing rules, gives you non-aliasing function arguments (I believe).
There are a few cases where it's easier to get LLVM to generate certain code in one language than another, I imagine: semantic things like aliasing, inlining, and type information.
In general, though, it's just a question of which hoops you have to jump through in which language when comparing C/C++/Julia/Fortran on top of LLVM.
If this weren't their bread and butter, and not an example they picked themselves, it would indeed not be fair... But they chose this example specifically and said this is how you get state-of-the-art performance with Mojo, so...
That's cool, but Mojo literally just came out.