I took a cursory glance at the type inference code to get a feel for the code style and to check the error message style. The error messages look well formatted for the type inference phase, and the algorithm looks very capable. The code is dense, but I think digging into this implementation is going to be my weekend project. Congratulations to the daScript devs for getting this out there.
New scripting language which claims to be fast and includes LuaJIT as a comparison in the benchmarks? That is a real test.
To do a surface-level bikeshed: not a huge fan of supporting both whitespace and braces. It's needless flexibility in the language when they could just pick one. If there is an eventual code formatter, I assume the decision will be made then.
Here is a concrete need for the flexibility: you can have a clean Pythonic view by default, but then improve readability with a denser layout if you want:
if (n < 2) { return n }
else { return fibR(n - 1) + fibR(n - 2) }
That only works if you're manually formatting the code. It's hard to imagine a deterministic rule for deciding when to compress that won't disappoint people rather often.
It's a bit too obscure. I can't imagine when I'd ever manually format.
That's the beauty of flexibility - for those cases where beauty matters, it's reachable, but not forced upon anyone else
I don't even find it that hard to imagine (I can even imagine multiple seamlessly integrated AI-suggested options generated based on your previously handcrafted examples), but agree with your underlying point that that's likely not going to happen given the state of tooling design across the board
Once someone uses a flexible feature, everyone else who works on the codebase has to deal with it too. Anyone who thinks of code as more of an implementation detail than a goal probably won't like not being able to just click auto format and forget it.
You can resolve it by choosing a standard style for the whole project and letting auto format do the rest.
Maybe even auto rejecting commits that don't match it, like some projects do.
You could of course do the same with a flexible language, but unless at least some parts are formatted manually, there's probably not going to be as much benefit.
I guess you could say "auto format leaves braces alone" but then you don't have a fully machine executable style.
Maybe you could have the auto format just always compact the braces, but that seems like encouraging dense one-liners, which most projects probably want to discourage.
You can't resolve it because it's manual formatting that's not part of the auto format rules, that's the point! And this doesn't depend on braces vs whitespace, manual improvements are possible in either case, so your argument against {} vs space flexibility doesn't have the benefit of forcing machine executable style
> As an interpreted language, daScript typically outperforms everything, including the blazing fast LuaJIT
According to https://dascript.org/ the daScript interpreter is claimed to be 3.5 times faster than LuaJIT and 11.7 times faster than the PUC Lua VM, taking the geomean over the fibonacci, primes, particles and n-body benchmarks.
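For anyone unsure what "geomean" means for a set of benchmarks, here is a minimal C++ sketch. It assumes each benchmark is summarized as a speedup ratio (other runtime divided by daScript runtime); the function name `geomean` is mine, and I have no idea how the site actually computes its number.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of per-benchmark speedup ratios.
// Computed as exp(mean(log(r))) to avoid overflow from a long product.
double geomean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    double log_sum = 0.0;
    for (double r : ratios) {
        assert(r > 0.0);  // log is only defined for positive ratios
        log_sum += std::log(r);
    }
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```

With made-up ratios of 2.0 and 8.0 (placeholders, not measurements), the geomean is 4.0 rather than the arithmetic mean of 5.0, so a single outlier benchmark pulls the summary less.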
Did someone verify this? What's the trick? How can daScript be faster than the tracing LuaJIT (or even than the LuaJIT interpreter, if that one was used instead of the JIT)?
Actually, when I first saw this, I set out to investigate. I managed to reproduce the same performance in an isolated environment, and I have been sitting on a post about how this works for quite some time. Hopefully, I will have time to finish it this weekend.
Yes, it is that fast.
Long story short, the trick is a completely novel way of implementing interpreters.
It is tree-based and I should stress that the tree is not an AST tree but a highly specialized tree-based representation produced from the AST tree. A node is basically equivalent to an "instruction" in a bytecode-based interpreter. Usual arguments against tree-based interpreters do not apply: there is no pointer-chasing, since the nodes sit tightly in the CPU cache. Dispatch overhead is just a virtual function call so should be comparable to a computed goto-based interpreter.
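To make the "node = instruction, dispatch = virtual call" idea concrete, here is a minimal C++ sketch. Every name in it (`Context`, `Node`, `ConstNode`, `LocalNode`, `AddNode`, `eval_x_plus_2`) is hypothetical and not taken from daScript's source. It also uses `std::unique_ptr` for brevity, whereas the cache locality described above would presumably need the nodes to be allocated from one contiguous arena.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// All names here are hypothetical; this is not daScript's real node API.
struct Context {
    std::vector<std::int32_t> locals;  // local variable slots
};

// One node plays the role of one bytecode "instruction".
struct Node {
    virtual ~Node() = default;
    virtual std::int32_t eval(Context& ctx) const = 0;  // dispatch is one virtual call
};
using NodePtr = std::unique_ptr<Node>;

struct ConstNode final : Node {
    std::int32_t value;
    explicit ConstNode(std::int32_t v) : value(v) {}
    std::int32_t eval(Context&) const override { return value; }
};

struct LocalNode final : Node {
    std::size_t slot;
    explicit LocalNode(std::size_t s) : slot(s) {}
    std::int32_t eval(Context& ctx) const override {
        assert(slot < ctx.locals.size());  // debug-only bounds check
        return ctx.locals[slot];
    }
};

struct AddNode final : Node {
    NodePtr lhs, rhs;
    AddNode(NodePtr l, NodePtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    // The result goes back to the parent as a plain return value.
    std::int32_t eval(Context& ctx) const override {
        return lhs->eval(ctx) + rhs->eval(ctx);
    }
};

// Builds the tree for `x + 2`, then evaluates it for a given x.
std::int32_t eval_x_plus_2(std::int32_t x) {
    NodePtr tree = std::make_unique<AddNode>(std::make_unique<LocalNode>(0),
                                             std::make_unique<ConstNode>(2));
    Context ctx{{x}};
    return tree->eval(ctx);
}
```

Calling `eval_x_plus_2(40)` walks three nodes with three virtual calls and no decoding step in between, which is the property the paragraph above is pointing at.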
The nodes operate on machine-native primitive types and the values are passed around between nodes in registers most of the time; oftentimes the interpreter doesn't even need to access main memory. There is no encoding or decoding of values and no typechecking during evaluation. The interpreter even goes to great lengths to avoid conversion of values in most paths, i.e. if your code operates on uint32_ts, they will be passed around as such up until a node that requires them to be cast, effectively creating typed execution paths.
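A rough C++ sketch of what such a typed path could look like, assuming one node interface per result type. The names (`TypedNode`, `LocalU32`, `MulU32`, `CastU32ToF64`, `eval_product_as_double`) are invented for illustration and are not daScript's.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical names; a sketch of the idea, not daScript's implementation.
struct Context {
    std::vector<std::uint32_t> locals;
};

// One interface per result type: eval returns the raw machine value,
// so there is no boxing, tagging, or runtime type check.
template <typename T>
struct TypedNode {
    virtual ~TypedNode() = default;
    virtual T eval(Context& ctx) const = 0;
};
using U32Ptr = std::unique_ptr<TypedNode<std::uint32_t>>;

struct LocalU32 final : TypedNode<std::uint32_t> {
    std::size_t slot;
    explicit LocalU32(std::size_t s) : slot(s) {}
    std::uint32_t eval(Context& ctx) const override {
        assert(slot < ctx.locals.size());  // debug-only bounds check
        return ctx.locals[slot];
    }
};

struct MulU32 final : TypedNode<std::uint32_t> {
    U32Ptr lhs, rhs;
    MulU32(U32Ptr l, U32Ptr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    // Stays uint32_t end to end; wraps modulo 2^32 like the native type.
    std::uint32_t eval(Context& ctx) const override {
        return lhs->eval(ctx) * rhs->eval(ctx);
    }
};

// The only place a conversion happens is an explicit cast node.
struct CastU32ToF64 final : TypedNode<double> {
    U32Ptr arg;
    explicit CastU32ToF64(U32Ptr a) : arg(std::move(a)) {}
    double eval(Context& ctx) const override {
        return static_cast<double>(arg->eval(ctx));
    }
};

// Tree for `double(a * b)` with a and b stored as uint32_t locals.
double eval_product_as_double(std::uint32_t a, std::uint32_t b) {
    auto tree = std::make_unique<CastU32ToF64>(std::make_unique<MulU32>(
        std::make_unique<LocalU32>(0), std::make_unique<LocalU32>(1)));
    Context ctx{{a, b}};
    return tree->eval(ctx);
}
```

Because the child types are fixed when the tree is built, a mismatched tree simply doesn't compile here; in a real interpreter the equivalent check would happen once in the compilation phase rather than on every evaluation.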
Before the interpreter runs there is also a "compilation" phase which performs a ton of optimizations, of which I guess the most interesting is node fusion. As you may guess, it replaces smaller nodes with larger and more specialized ones. In my tests I implemented it for a naive AST-walker-like interpreter and got a 3x performance improvement.
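Here is a toy version of node fusion in C++, assuming a single rewrite rule that turns `Add(Local, Const)` into one specialized node. The names (`Add`, `Local`, `Const`, `AddLocalConst`, `fuse`) and the use of `dynamic_cast` for pattern matching are my own choices for the sketch, not how daScript does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical names; a toy fusion pass, not daScript's optimizer.
struct Context {
    std::vector<std::int32_t> locals;
};

struct Node {
    virtual ~Node() = default;
    virtual std::int32_t eval(Context& ctx) const = 0;
};
using NodePtr = std::unique_ptr<Node>;

struct Const final : Node {
    std::int32_t value;
    explicit Const(std::int32_t v) : value(v) {}
    std::int32_t eval(Context&) const override { return value; }
};

struct Local final : Node {
    std::size_t slot;
    explicit Local(std::size_t s) : slot(s) {}
    std::int32_t eval(Context& ctx) const override {
        assert(slot < ctx.locals.size());  // debug-only bounds check
        return ctx.locals[slot];
    }
};

struct Add final : Node {
    NodePtr lhs, rhs;
    Add(NodePtr l, NodePtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    std::int32_t eval(Context& ctx) const override {
        return lhs->eval(ctx) + rhs->eval(ctx);
    }
};

// Fused node: does the work of Add(Local, Const) with one dispatch
// instead of three.
struct AddLocalConst final : Node {
    std::size_t slot;
    std::int32_t value;
    AddLocalConst(std::size_t s, std::int32_t v) : slot(s), value(v) {}
    std::int32_t eval(Context& ctx) const override {
        assert(slot < ctx.locals.size());
        return ctx.locals[slot] + value;
    }
};

// Bottom-up pass: fuse the children first, then try to fuse this node.
NodePtr fuse(NodePtr node) {
    auto* add = dynamic_cast<Add*>(node.get());
    if (!add) return node;  // only Add nodes are rewritten in this sketch
    add->lhs = fuse(std::move(add->lhs));
    add->rhs = fuse(std::move(add->rhs));
    auto* local = dynamic_cast<Local*>(add->lhs.get());
    auto* constant = dynamic_cast<Const*>(add->rhs.get());
    if (local && constant) {
        return std::make_unique<AddLocalConst>(local->slot, constant->value);
    }
    return node;
}
```

Running `fuse` on the tree for `x + 2` returns a single `AddLocalConst` node, so evaluating it costs one virtual call where the unfused tree costs three. A real pass would need many such patterns to pay off.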
I could go on, but I'd better finish the post.
I would also add that writing an interpreter in this style is far simpler than writing even the simplest JIT, but it does impose certain requirements on the language itself.
All in all, the design is nothing short of genius.
Interesting, thanks for the info. So it seems that not needing the type checks and conversions that dynamic languages require has a positive influence on performance, but I would expect significantly less than a factor of two. The specialized tree means that the interpreter is closer to a bytecode interpreter than to an AST interpreter and doesn't suffer from the drawbacks of the latter, but then it is presumably not responsible for the measured speed-up (it merely avoids a slow-down). The mentioned optimizations must then be responsible for the speed-up, but LuaJIT has many optimizations too.
That would be OK with me because daScript apparently supports static typing and Lua doesn't; it's OK to compare statically and dynamically typed languages, but I would be surprised if that really caused the factor of 3.5 performance difference.
Semantic whitespace indentation and {} at the language level, so that you can have a clean Pythonic view without the risk of looks differing from the meaning, and dense one-liners if you like? What an awesome design feature!!! Wish it were more widespread.
A statically typed embeddable language with pointers - very rare.
I don't like some of the decisions (e.g. all pointers are nullable) but I wish there were more languages in the space. The only other ones I know of are AngelScript which is old and crusty, and Gluon which is unnecessarily functional IMO.
At least there's the option to do it properly, ie with matching parentheses or braces of some kind. But then what when you're editing someone else's code and they used whitespace - you're stuck trying to decipher the indent level and wrangling tabs and spaces and whatever.
Is Python not one of the world's most-used programming languages?
Can you not write parts of C++ sans curly braces?
When using curly braces for scoping (which is what they really define, tbh), do lines of code within the scope not end up aligned LTR (or, RTL, depending on your language)?
Do modern IDEs/word processors not have the capability to replace ASCII-defined tabs with spaces and vice-versa?
Seems like having issues with spacing is a you problem; most people handle it just fine.
erm, no, perhaps most people who use Python, deal with it, because they have to. And it's about significant indentation/whitespace-defined-scope, not spaces versus tabs, though that can also cause havoc for these languages.
Sure Python is popular, like k-pop, but not necessarily because it's the best, and certainly not because of significant-whitespace, which I suspect even Guido regrets (eg in interviews, he hardly seems to defend this choice).
Yes, you often format your code anyway in a way that could be used to ascertain scopes, but to me, using an actual delimiter is like double-entry book-keeping, and easier than manual formatting. You write your code with delimiters. Your IDE formats accordingly (with style rules that you can choose BTW). If it looks weird, you messed up the delimiters and it's obvious. Not to mention copy/pasting code, moving blocks around, etc.; all fraught or painful with significant indentation.
Form follows function, not the other way around, IMHO.
It is designed to provide high performance and serves as an embeddable 'scripting' language for C++ applications that require fast and reliable performance, such as games or back end/servers. Additionally, it functions effectively as a standalone programming language.