Exactly. LLMs are trained on huge amounts of bash scripts, so they “know” how to use grep/awk/whatever. ASTs are, I assume, not really part of that training data, so how would they know how to work well with one? LLMs are trained on what humans do to code. Yes, I assume someone will eventually train more efficient versions that can work more closely with the machine. But LLMs work as well as they do because they have a large body of “sed” statements in their statistical models.
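For concreteness, the kind of one-liner that dominates that training data looks something like this (a generic rename sketch; `old_name`/`new_name` are placeholder identifiers, and the GNU sed `-i` flag is assumed):

```shell
# Sketch of a typical refactor one-liner an LLM has seen many times:
# rename old_name to new_name in every .py file under src/.
# Uses GNU grep --include and GNU sed -i; BSD sed would need -i ''.
grep -rl 'old_name' --include='*.py' src/ | xargs sed -i 's/old_name/new_name/g'
```

Note the edge cases this silently mishandles: substring matches like `old_name2`, occurrences inside strings and comments, filenames with spaces. That fragility is exactly what the AST argument below is about.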
They also know how to use modern tools like fd and rg, which allow more complex operations in a single call.
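A rough illustration of that collapsing effect (assumes rg and fd are installed; the pattern and file type are arbitrary examples):

```shell
# rg folds the grep -r + --include dance into one call with built-in
# type filters and .gitignore awareness.
rg -t py -l 'old_name' src/

# fd does the same for find: extension filter plus a per-file command.
fd -e py -x wc -l
```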
Tree-sitter is more or less a universal AST parser you can run queries against. Writing queries against an AST that you incrementally rebuild is massively more powerful and precise for generating the correct context than hand-writing endless shell pipeline one-liners and hoping you've handled all the edge cases.
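For a taste of what those queries look like, here is a sketch against tree-sitter's Python grammar (node and field names from that grammar; the `"old_name"` literal is a hypothetical example):

```scheme
; Capture every function definition's name.
(function_definition
  name: (identifier) @function.name)

; Capture calls to one specific function, wherever they appear.
(call
  function: (identifier) @call.name
  (#eq? @call.name "old_name"))
```

Unlike the regex pipeline, this can't accidentally match inside a string or a comment, and it keeps working regardless of formatting, because it runs against parsed structure rather than text.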
I agree with you, but the question is whether existing LLMs have enough training data involving AST queries to be more effective with that approach. It's not like LLMs were designed to be precise in the first place.