I’ve been thinking about a similar problem I run into regularly: I audit a lot of code in many languages, and sometimes I’m not even able to build the solutions. My toolset is pretty much git and rudimentary regex. A recent example is a PHP function that is insecure when it’s used at least 3 times in a string concat.
Any suggestion on an AST ‘framework’ that would help me parse code easier? Language specific or generic, even if it only sort-of fits (I don’t even know what I want).
Automated code analysers exist, but I want something manual.
I wrote a library that turns a language specification (BNF, ABNF, etc.) into a custom AST in Python. You can then define a set of visitors for the custom AST to transform the tree into whatever you want. I implemented some checks, integrated with the Python type system, to type-check the visitors, and I've used the library for some non-trivial manipulations before.
It's mostly a pet project and I just wanted to share since it could maybe at least inspire something.
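To give a feel for the visitor idea applied to the "function used at least 3 times in a string concat" example, here is a minimal sketch. It is not my library: it uses Python's standard `ast` module and parses Python source as a stand-in, since auditing PHP would need a PHP parser instead. The function name `esc` and the helper `find_repeated_calls` are hypothetical, and only calls that are direct operands of a `+` chain are counted.

```python
import ast


def flatten_concat(node):
    """Return the operands of a nested `a + b + c` chain, in source order."""
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return flatten_concat(node.left) + flatten_concat(node.right)
    return [node]


class RepeatedCallInConcat(ast.NodeVisitor):
    """Flag `+` chains in which a given function is called too many times."""

    def __init__(self, func_name, threshold=3):
        self.func_name = func_name
        self.threshold = threshold
        self.findings = []  # list of (line number, call count)

    def visit_BinOp(self, node):
        if not isinstance(node.op, ast.Add):
            self.generic_visit(node)
            return
        operands = flatten_concat(node)
        count = sum(
            1
            for operand in operands
            if isinstance(operand, ast.Call)
            and isinstance(operand.func, ast.Name)
            and operand.func.id == self.func_name
        )
        if count >= self.threshold:
            self.findings.append((node.lineno, count))
        # Visit the operands, not the nested BinOp nodes,
        # so that each chain is reported only once.
        for operand in operands:
            self.visit(operand)


def find_repeated_calls(source, func_name, threshold=3):
    """Hypothetical helper: parse `source` and return the flagged chains."""
    visitor = RepeatedCallInConcat(func_name, threshold)
    visitor.visit(ast.parse(source))
    return visitor.findings


# `esc` is a made-up function name used only for illustration.
sample = 'q = "a" + esc(x) + "b" + esc(y) + esc(z)\nr = esc(x) + "c"\n'
print(find_repeated_calls(sample, "esc"))
```

This kind of "count something within one expression" check is awkward with regex, because the chain can span lines and contain nested parentheses, but it is a short tree walk once the code is parsed.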
This is not a problem I've had, but it sounds like something tree-sitter might be able to help with.
I've been using tree-sitter for syntax highlighting, simple refactors, and custom text objects in NeoVim recently (and love it, thanks NeoVim devs!). Have a play with the syntax tree it generates at https://tree-sitter.github.io/tree-sitter/playground (it can generate a tree for most languages, not just the ones in the playground).