Hacker News new | past | comments | ask | show | jobs | submit login

Note that PEGs are not context-free grammars. They're both more and less powerful than traditional CFGs, and they're tricky to use: because PEG choice is ordered and traditional CFG choice is unordered, it's hard to translate standard language grammars to a PEG recognizer system. That's why, for my forever-project, I've oped to use scannerless GLR instead of PEGs. Both PEGs and GLR recognize languages that are closed under composition (the property that gives you extensibility), but the formalisms for GLR parsers are much better.

The Harmonia project is the best whack at the problem I've seen. See http://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf.

As others have mentioned, for an IDE, you also want strong error recovery. Doing that in a general way when using tools based on declarative grammars is, well, very hard, especially when you want to recover from brace mismatch problems. The best approach is "island and reef parsing", where you actually parse your buffer twice: you first build a map of all the "reefs" (parenthesis) using a simple recursive descent parser, pair up mismatched parenthesis using an ad-hoc algorithm, insert corrections for mismatches, then apply your fully general parser to the result. (The word "parenthesis" here refers to any balanced construct, even "begin" and "end". You can actually infer what the "parenthesis" for a given language are by examining the grammar!)

See also http://fileadmin.cs.lth.se/cs/Personal/Emma_Soderberg/docs/S...




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: