> If we had just trusted its output, we would now have a security vulnerability in production, allowing anyone to access other people's accounts.
This is one reason you should always get a different model to review a model's PR. Gemini or GPT-codex would certainly have noticed the missing auth.