I ran into this from the other direction. I built a small SRE agent for my cloud infra and ended up hand-rolling some of the tools rather than using what already exists. I gave it an edit_file tool that seemed reasonably capable, but in practice the agent would regularly attempt a one-line change and submit a PR in which three-quarters of the file was hallucinated.

Seeing how bad the results are when you approach this casually makes it evident that there is a lot of room for optimization here.
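
For illustration, here is a minimal sketch of one stricter tool shape. It assumes the failure came from letting the model re-emit large parts of the file, which may not be what my original tool did. This is not that tool. The name `edit_file` is reused only for convenience, and the `old`/`new` parameters are hypothetical. The idea is that the model supplies only the exact text to change, and the tool refuses anything ambiguous instead of guessing.

```python
from pathlib import Path


def edit_file(path: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old` with `new` in the file at `path`.

    Hypothetical tool signature: the model never re-emits the whole file,
    so untouched lines cannot be hallucinated.
    """
    if not old:
        raise ValueError("old text must not be empty")

    file = Path(path)
    text = file.read_text(encoding="utf-8")

    # Refuse to guess: the target must match once and only once.
    count = text.count(old)
    if count == 0:
        raise ValueError("old text not found; re-read the file and retry")
    if count > 1:
        raise ValueError(f"old text matches {count} places; include more context")

    file.write_text(text.replace(old, new, 1), encoding="utf-8")
    return f"replaced 1 occurrence in {path}"
```

The errors are written so they can be handed back to the agent as the tool result, which gives it a chance to retry with more surrounding context rather than rewriting the file.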