The D language compiler uses a technique I call "poisoning" which has greatly re...

munificent · on May 6, 2024

I've got a compiler for a hobby language that uses this technique too (I probably got it from you, unless I heard it from someone else). It's really really nice. Super easy to implement and really does cut down on cascaded errors.

I also use it during type checking. There is a special "error" type. If an expression produces a type error, the yielded type of that expression becomes "error". Surrounding expressions that consume that type will see the error type and suppress any other type errors they might otherwise produce. That way, you only see the original type error.
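A minimal sketch of the error-type idea described above, assuming a toy expression language with only `int`, `bool`, `+`, and `!`. Every name here (`Type`, `Expr`, `make`, `check`) is hypothetical and not taken from any real compiler; the point is only that an operand of type `Error` makes the parent return `Error` without reporting anything new.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy type checker; none of these names come from a real compiler.
enum class Type { Int, Bool, Error };

struct Expr {
    enum class Kind { IntLit, BoolLit, Add, Not } kind;
    std::unique_ptr<Expr> lhs;  // operand of Not, left operand of Add
    std::unique_ptr<Expr> rhs;  // right operand of Add
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr make(Expr::Kind kind, ExprPtr lhs = nullptr, ExprPtr rhs = nullptr) {
    auto e = std::make_unique<Expr>();
    e->kind = kind;
    e->lhs = std::move(lhs);
    e->rhs = std::move(rhs);
    return e;
}

// Returns the type of `e`. A diagnostic is appended to `errors` only where a
// type error originates; an operand that is already Error is passed upward
// silently.
Type check(const Expr& e, std::vector<std::string>& errors) {
    switch (e.kind) {
    case Expr::Kind::IntLit:
        return Type::Int;
    case Expr::Kind::BoolLit:
        return Type::Bool;
    case Expr::Kind::Not: {
        Type t = check(*e.lhs, errors);
        if (t == Type::Error) return Type::Error;  // already reported below
        if (t != Type::Bool) {
            errors.push_back("'!' expects bool");
            return Type::Error;
        }
        return Type::Bool;
    }
    case Expr::Kind::Add: {
        Type l = check(*e.lhs, errors);
        Type r = check(*e.rhs, errors);
        if (l == Type::Error || r == Type::Error) return Type::Error;
        if (l != Type::Int || r != Type::Int) {
            errors.push_back("'+' expects int operands");
            return Type::Error;
        }
        return Type::Int;
    }
    }
    return Type::Error;  // not reached; keeps compilers quiet
}
```

For an expression like `!((!1) + 2)`, only the innermost `!1` is reported; the `+` and the outer `!` see `Error` and stay quiet.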

tnh · on May 6, 2024

> Surrounding expressions that consume that type will see the error type and suppress any other type errors they might otherwise produce.

We added a slightly-cursed version of this to clang. The goal was: include more broken code in the AST instead of dropping it on the floor, without adding noisy error cascades.

The problem is, adding a special case to all "surrounding expressions that consume that type" is literally thousands of places. It's often unclear exactly what to do, because "consume" means so many things in C++ (think overload resolution and argument-dependent lookup) and because certain type errors are used in metaprogramming (thanks, SFINAE). So this would cost a lot of complexity, and it's too late to redesign clang around it.

But C++ already has a mechanism to suppress typechecking! Inside a template, most analysis of code that depends on a template parameter is deferred until instantiation. The implementation of this is hugely complicated and expensive to maintain, but that cost is sunk. So we piggy-backed on this mechanism: clang's error type is `<dependent type>`. The type of this expression depends on how the programmer fixes their error :-)

And that's the story of how C gained dependent types (https://godbolt.org/z/szGdeGhrr), because why should C++ have all the fun?

(This leaves out a bunch of nuance; of course, the truth is always more complicated.)

WalterBright · on May 7, 2024

In D, all semantic analysis of a template waits until instantiation time. This is because D is designed so that the syntax parsing does not need a symbol table.

In C++, I solved this problem by simply matching { } in the template body, and accumulating a list of tokens within the { }. Then, when instantiated, the template parameter values were known, and the template syntax could then be semantically analyzed. It was simple and effective.
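The brace-matching step can be sketched roughly as follows. This is an illustration only, assuming a hypothetical token stream where each token is just its spelling; `Token` and `captureBody` are made-up names and say nothing about how any real compiler stores tokens.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token representation: a token is just its spelling.
using Token = std::string;

// If tokens[pos] is "{", copies every token up to and including the matching
// "}" and advances pos past it. The captured tokens are not analyzed here;
// they would be replayed through the parser at instantiation time.
// Returns an empty vector (and leaves pos unchanged) if pos does not point at
// "{" or the braces are unbalanced.
std::vector<Token> captureBody(const std::vector<Token>& tokens, std::size_t& pos) {
    std::vector<Token> body;
    if (pos >= tokens.size() || tokens[pos] != "{") return body;

    int depth = 0;
    for (std::size_t i = pos; i < tokens.size(); ++i) {
        body.push_back(tokens[i]);
        if (tokens[i] == "{") {
            ++depth;
        } else if (tokens[i] == "}" && --depth == 0) {
            pos = i + 1;  // resume parsing after the template body
            return body;
        }
    }
    return {};  // ran out of tokens before the braces balanced
}
```

The trade-off discussed in this thread follows directly: nothing inside the captured body is checked until some instantiation replays it.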

But I was informed that C++ required the syntax parsing and semantics for non-dependent types without instantiation. I asked why, and the answer was "to check for errors without needing to instantiate it." I responded with "of what use is checking it if it is never used or tested?" And that was the end of that.

> The implementation of this is hugely complicated and expensive to maintain

I quietly revolted and refused to implement that disaster. AFAIK there was never a problem with deferring parsing/semantic until instantiation.

Matheus28 · on May 7, 2024

Isn't that what MSVC does? I remember it being a little weird about when it actually checks for errors in templated code.

tnh · on May 8, 2024

Yes (at least approximately, I'm fuzzy on the details).

These days it supports both. (IIRC the default is legacy/nonstandard, you select the standard behavior with /permissive-, and VS adds /permissive- to newly generated projects.)

https://devblogs.microsoft.com/cppblog/two-phase-name-lookup...

WalterBright · on May 8, 2024

I have no idea what msvc does with this.

tnh · on May 8, 2024

Yeah, that trade-off makes a lot of sense. It's the logical conclusion of the "templates are textual" model. But C++ loves to have its cake and eat it regardless of the complexity, so it's non-conforming.

(I expect it's possible to construct cases where this difference is observable)

I think checking templates in isolation has value. We use statically typed languages in part to make more error classes locally-verifiable. But bolting that into a mostly-textual system is a mess.

(Checking templates in isolation is particularly valuable in IDEs, which tend to share logic with compiler frontends. IDEs only need that much power to do a passable job because the language is so complex, so I don't know which way this argument points)

munificent · on May 7, 2024

I love/hate/but-really-love/but-totally-hate this.

nathan_douglas · on May 6, 2024

Funny, I was just gonna comment and say that I learned this One Weird Trick… and then realized I’d learned it from you.

(Hi, Bob! I was actually just using your recursive shadowcasting algorithm as a reference yesterday.)

I feel like I just got an Erdos number or something.

munificent · on May 7, 2024

WalterBright · on May 6, 2024

Yup, dmd also has an error type:

https://github.com/dlang/dmd/blob/master/compiler/src/dmd/mt...

kazinator · on May 7, 2024

You can trivially reduce the number of messages to one by bailing on the first error.

pvillano · on May 13, 2024

That makes error-driven development difficult. I think a compiler should attempt to emit one error message per human error.

lifthrasiir · on May 7, 2024

I think this is very commonly used and possibly independently discovered a lot of times---I had done so for example---because it is a very logical conclusion. Clang has `RecoveryExpr` [1] for this purpose (I couldn't find a GCC equivalent).

[1] https://clang.llvm.org/doxygen/classclang_1_1RecoveryExpr.ht...

jansvoboda11 · on May 6, 2024

Do you cut off this poisoning at any point?

WalterBright · on May 6, 2024

If something else depends on the successful semantic analysis of those poisoned nodes, they get poisoned as well. It's all based on dependency.

tester756 · on May 6, 2024

But with this approach you aren't able to provide e.g. IntelliSense or other IDE hints for valid things within this node, right?

tnh · on May 6, 2024

Things within the node are fine because errors propagate upwards. The problem is things like:

    Foo getFoo(int);
    getFoo().???; // want code completion!

`getFoo()` is missing an argument, but you still want to complete Foo's members. If you type `getFoo().getBar()` you want go-to-definition to work on `getBar`.

In clang, we use heuristics to preserve the type in high-confidence cases (here, overload resolution for `getFoo` failed, but there was only one candidate function). This means you can get cascading errors in those cases (but often that's a good thing, especially in an IDE - tradeoffs).
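A rough sketch of that single-candidate heuristic, under heavy simplification: overloads are matched on argument count alone, and types are plain strings. `Candidate`, `CallResult`, and `resolveCall` are hypothetical names for illustration and do not correspond to clang's actual data structures.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a function declaration.
struct Candidate {
    std::string returnType;
    std::size_t arity;
};

struct CallResult {
    bool resolved;     // did overload resolution succeed?
    std::string type;  // best-known type of the call; empty means "unknown"
};

// Toy overload resolution that matches on argument count only. When it fails
// but exactly one candidate exists, the candidate's return type is kept so
// later tooling (completion, go-to-definition) still has a type to work with.
CallResult resolveCall(const std::vector<Candidate>& overloads, std::size_t argCount) {
    for (const Candidate& c : overloads) {
        if (c.arity == argCount) return {true, c.returnType};
    }
    if (overloads.size() == 1) {
        return {false, overloads.front().returnType};  // high-confidence guess
    }
    return {false, ""};  // ambiguous guess: leave the type unknown
}
```

With a single `Foo getFoo(int)` declared, a call with no arguments fails to resolve but still reports `Foo`, which is what makes completing `getFoo().` possible. The cost is the one mentioned above: keeping a guessed type means errors that depend on it can cascade.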

WalterBright · on May 6, 2024

A hint can be provided where the original error is diagnosed. I doubt hints on cascaded errors would be of any use.

tester756 · on May 6, 2024

If you have a function like

"public intd Test() { ... typing new line }"

then you will not provide hints when writing that new line, because "intd" is an invalid type?

WalterBright · on May 6, 2024

A function's declaration is not affected by its body, so that shouldn't be a problem.