> I think the issues is, it is going against a very well established pattern. I have a tool that wraps ripgrep so that search results always includes context and from time to time, the agent will use ripgrep by itself and when I ask why, it would go "yeah I should have done that"
There's a limit of how many simultaneous instructions an agent can follow (the exact number depends on the specific model so instructions that are fine for one model may overwhelm another). If this keeps happening, consider trimming your instructions or even better, solving it at the harness level (like intercepting and rewriting ripgrep calls to use your thing, like rtk [0] does in agents that supports this)
Overall, never leave to an agent an instruction that must be followed at all times. For example, doing things in a git hook beats a multi-command workflow every time the agent commit, etc.
Is this state of things forever? I don't think so. Very soon models will become so better this will be a non-problem