The issue is making sense of the incredibly detailed AST in order to answer various questions about the code base. For example, how do you build an information flow graph that shows which functions read and write which variables in a set of C++ classes?
This sounds like it might be a good use case for one of the LLM coding tools.
A pure AST wouldn't even have that information - it'd have the syntax, but not the semantics.
I tried this with the previous versions of Sonnet and Gemini Pro. Sonnet's context, back then, could not hold the full source I was working on. Reducing the context did allow it to produce a graph. Both LLMs produced graphs with enough omissions and errors to make the results not useful. In the end, I wrote an interpreter based on libclang to provide the semantics I needed for my particular case. That was not trivial for me (I have decades of experience with s/w dev, working with graphs, etc. - but not compiler development), and I used an LLM's help to do the development. Any new type of semantics would require hard-coding a new AST interpreter and graph construction. Repeating all of that today, with the better LLMs and after more practice driving them, might produce something with less effort and more flexibility.
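To make the graph-construction half concrete, here is a minimal sketch in Python. It is not the libclang interpreter described above. It assumes an AST walker has already emitted (function, variable, access) records, and the record format, the Sensor names, and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical access records, in the shape an AST walker might emit:
# (function, variable, access), where access is "read" or "write".
ACCESSES = [
    ("Sensor::poll", "Sensor::raw_", "write"),
    ("Sensor::value", "Sensor::raw_", "read"),
    ("Sensor::value", "Sensor::scale_", "read"),
    ("Sensor::calibrate", "Sensor::scale_", "write"),
    ("Sensor::calibrate", "Sensor::raw_", "read"),
]


def build_flow_graph(accesses):
    """Group records into two maps: variable -> readers, variable -> writers."""
    readers = defaultdict(set)
    writers = defaultdict(set)
    for func, var, access in accesses:
        if access == "read":
            readers[var].add(func)
        elif access == "write":
            writers[var].add(func)
        else:
            raise ValueError(f"unknown access kind: {access!r}")
    return readers, writers


def function_flows(readers, writers):
    """Derive (writer, variable, reader) edges: data may flow from the
    writing function to the reading function through the variable."""
    edges = set()
    for var, writing_funcs in writers.items():
        for w in writing_funcs:
            for r in readers.get(var, ()):
                if w != r:
                    edges.add((w, var, r))
    return sorted(edges)


def to_dot(edges):
    """Render the edges as Graphviz DOT text, one labelled arrow per flow."""
    lines = ["digraph flow {"]
    for w, var, r in edges:
        lines.append(f'  "{w}" -> "{r}" [label="{var}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    readers, writers = build_flow_graph(ACCESSES)
    print(to_dot(function_flows(readers, writers)))
```

The hard part stays in producing the records: deciding whether an expression counts as a read or a write (for example, a variable passed by non-const reference) is the semantic work the AST alone does not do, and this sketch sidesteps it entirely.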
In any case, this experience gave me a new appreciation for compiler developers!