Exactly. LLMs are trained on huge amounts of bash scripts, so they “know” how to use grep/awk/whatever. ASTs are, I assume, not really part of that training data, so how would they know how to work well with one? LLMs are trained on what humans do to code. Yes, I assume someone will eventually train more efficient versions that can work more closely with the machine. But LLMs work as well as they do because they have a large body of “sed” statements in their statistical models.
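For concreteness, the kind of one-liner that dominates that training data looks something like this (a generic rename sketch; `old_name`/`new_name` are placeholder identifiers, and the GNU sed `-i` flag is assumed):

```shell
# Sketch of a typical refactor one-liner an LLM has seen many times:
# rename old_name to new_name in every .py file under src/.
# Uses GNU grep --include and GNU sed -i; BSD sed would need -i ''.
grep -rl 'old_name' --include='*.py' src/ | xargs sed -i 's/old_name/new_name/g'
```

Note the edge cases this silently mishandles: substring matches like `old_name2`, occurrences inside strings and comments, filenames with spaces. That fragility is exactly what the AST argument below is about.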
They also know how to use modern tools like fd and rg, which allow more complex operations in a single call.
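A rough illustration of that collapsing effect (assumes rg and fd are installed; the pattern and file type are arbitrary examples):

```shell
# rg folds the grep -r + --include dance into one call with built-in
# type filters and .gitignore awareness.
rg -t py -l 'old_name' src/

# fd does the same for find: extension filter plus a per-file command.
fd -e py -x wc -l
```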
Tree-sitter is more or less a universal AST parser you can run queries against. Writing queries against an AST that you incrementally rebuild is massively more powerful and precise for generating the correct context than hand-writing endless shell pipeline one-liners and hoping you've handled all the edge cases.
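For a taste of what those queries look like, here is a sketch against tree-sitter's Python grammar (node and field names from that grammar; the `"old_name"` literal is a hypothetical example):

```scheme
; Capture every function definition's name.
(function_definition
  name: (identifier) @function.name)

; Capture calls to one specific function, wherever they appear.
(call
  function: (identifier) @call.name
  (#eq? @call.name "old_name"))
```

Unlike the regex pipeline, this can't accidentally match inside a string or a comment, and it keeps working regardless of formatting, because it runs against parsed structure rather than text.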
I agree with you, but the question is whether existing LLMs have enough training data involving AST queries to be more effective with that approach. It's not like LLMs were designed to be precise in the first place.