This what hooks[1] are for, except hooks allow specifying criteria in certain conditions (like the agent believing it’s done and ready to hand back to the user) in a manner that the agent won’t just forget about once it’s a few turns deep, and doesn’t require triggering a whole other LLM instance to read some plain text instructions while you hope it interprets them correctly.
It absolutely makes sense to have a system in place that allows the code generated by an LLM to be automatically validated but there’s no need to resort to a non-deterministic system for these sort of deterministic pass/fail conditions.
I was thinking the exact same thing. There are multiple places to implement hooks (git hooks, Claude hooks, etc.).
One thing I've been wondering about is how to reliably protect specific portions of the system from unexpected/unnecessary change (for example, a failing test that Claude decides to comment out or rewrite to get it to pass). My only thought for this was to automatically revert test changes during specific portions of the implementation, but that feels overly rigid and potentially prevents things like refactoring code.