That's a great point about allocation/memory management. As an example, rust-analyzer needs to free memory, but rustc's `free` is simply `std::process::exit`.
> I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.
I'm in strong agreement with you, but I will say: I've really grown to love query-based approaches to compiler-shaped problems. Makes some really tricky cache/state issues go away.
> Are they? I feel like intellisense is largely a subset of what a compiler already has to do.
They are distinct! Well, not just intellisense, but pretty much everything. I'll paraphrase this blog post, but the best way to think about think about the difference between a traditional compiler and an IDE is that compilers are top-down (e.g, you start compiling a program from a compilation unit's entrypoint, a `lib.rs` or `main.rs` in Rust), but IDEs are cursor-centric—they're trying to compile/analyze the minimal amount of code necessary to understand the program. After all, the best way to go fast is to avoid unnecessary work!
> If you wanted to argue that intellisense is a subset of compiling and it can be done faster and more efficiently I could buy that argument. But if you’re going to declare the tasks are at odds with one another I’d love to hear specific details!
Beyond the philosophical/architectural difference I mentioned above, compilers typically have a one-way mapping between syntax and mapping, but to support things like refactors or assists, you often need to do the opposite: go from semantics to syntax. For instance, if you want to refactor from struct to an enum, you often need to find all instances of said struct, make the semantic change, then construct the new syntax tree from the semantics. For simple transformations like a struct to an enum, a purely syntax-based based approach might work (albeit, at the cost of accuracy if you have two structs with same name), but you start to run into issues when you consider traits, interfaces (for example: think about how a type implements an interface in Go!), or generics.
It doesn't really make sense for a compiler to support above use cases, but they're are _foundational_ to an IDE. However, if a compiler is query-centric (as rustc is), then it's pretty feasible for rustc and rust-analyzer to share, for instance, the trait solver or the borrow checker (we're planning/scoping work on the former right now).
> I have a maybe wrong and bad opinion that LSP is actually at the wrong level. Right now every language needs to implement a from scratch implementation of their LSP server. These implementations are HUGE and take YEARS to develop. rust-analyzer is over 365,000 lines of code. And every language has their own massive, independent implementation.
rust-analyzer a big codebase, but it's also less problematic than the raw numbers would make you think. rust-analyzer has a bunch of advanced functionality (term search https://github.com/rust-lang/rust-analyzer/pull/16092 and refactors), assists (nearly 20% of rust-analyzer!) and tests.
> I think there should be a common Intellisense Database file format for providing LSP or LSP-like capabilities. Ok sure there will still be per-language work to be done to implement the IDB format.
> But you'd get like 95% of the implementation for free for any LLVM language. And generating a common IDB format should be a lot simpler than implementing a kajillion LSP protocols.
I wouldn't say 95%. SCIP/LSIF can do the job for navigation, but that's only a subset of what you want from an IDE. For example:
- Intellisense/autocomplete is extremely latency sensitive where milliseconds count. If you have features like Rust/Haskell's traits/typeclasses that allow writing blanket implementations like `impl<T> SomeTrait for T`, it's often faster to try to solve that trait bound on-the-fly than storing/persisting that data.
- It'd be nice to handle features like refactors/assists/lightbulbs. That's going to result in a bunch of de novo code needs to exist outside of a standard compiler, not counting all the supporting infrastructure.
> My dream world has a support file that contains: full debug symbols, full source code, and full intellisense data.
Rust tried something similar in 2017 with the Rust Language Server (RLS, https://github.com/rust-lang/rls). It worked, but most people found it too slow because it was invoking a batch compiler on every keystroke.
> Otherwise, it is another "rewrite it in rust" post which is yet to materialise into a real project.
I'm not sure you can reasonably said that "it is another 'rewrite it in rust' post..." if it's coming from a QEMU maintainer.
Besides, before starting a large engineering project like this, it's worthwhile to write a high-level proposal _then_ write code. Saves you a lot of wasted effort!
> I think there are open possibilities in making leaderless single-round-trip consensus systems for log-oriented FSMs, which is what pretty much everyone WANTS.
Woah, that's wild. Are there any pre-prints/papers/talks that you can link to on this subject? I'd _love_ to read this.
> I'm also excited about my own research with Elle, but we're still working on getting that through peer review, haha. ;-)
I read over bits of Elle; the documentation in it is absolutely top-notch. You and Peter Alvaro knocked it out of the park!
I think there are open possibilities in making leaderless single-round-trip consensus systems for log-oriented FSMs, which is what pretty much everyone WANTS.
This is based on her presentation and some dinner conversation at HPTS 2019, so I don't know if there's actually a paper I can point to. The gist of is that Paxos normally involves an arbitration phase where there are conflicting proposals, which adds a second pair of message delays. But if you relax the consensus problem to agreement on a set of proposals, rather than a single proposal, you don't need the arbitration phase. Instead of "who won", it becomes "everyone wins". Then you can impose an order on that set via, say, sorting, and iterate to get a replicated log.
I read over bits of Elle; the documentation in it is absolutely top-notch. You and Peter Alvaro knocked it out of the park!
Thank you! Could I... hang on, just let me grab reviewer #1 quickly, I'd like them to hear this. ;-)
> This is based on her presentation and some dinner conversation at HPTS 2019, so I don't know if there's actually a paper I can point to. The gist of is that Paxos normally involves an arbitration phase where there are conflicting proposals, which adds a second pair of message delays. But if you relax the consensus problem to agreement on a set of proposals, rather than a single proposal, you don't need the arbitration phase. Instead of "who won", it becomes "everyone wins". Then you can impose an order on that set via, say, sorting, and iterate to get a replicated log.
This sounds very similar to atomic broadcast (https://en.wikipedia.org/wiki/Atomic_broadcast) where each node sends a single message and the process ensures that all nodes agree on the same set of messages. Not sure how it would fit with a log-oriented FSM, but it certainly sounds interesting.
It’s really pretty trivial to implement RSM given an atomic broadcast protocol. But you can implement many other things, like totally ordered ephemeral messaging with arbitrary fanout, or a replicated durable log ala Kafka. Here’s my current favorite atomic broadcast protocol (from 2007 or so), which is leaderless, has write throughput saturating network bandwidth, and read throughput scaling linearly with cluster size:
We don't share the query infrastructure in rust-analyzer, but rust-analyzer is a query-driven system. We use Salsa (https://github.com/salsa-rs/salsa/), which is strongly inspired by rustc's query system, down to the underlying red/green algorithm (https://rustc-dev-guide.rust-lang.org/queries/incremental-co...), hence the name: Salsa.