Hacker News

I've been cataloging agent failure modes for two months. They're not random; they repeat. I gave them names so I could build mitigations:

Shortcut Spiral: agent skips verification to report "done" faster. Fix: mandatory quality loop with evidence for each step.
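One way to picture an evidence-per-step loop, as a rough sketch only: the agent cannot report "done" while any step lacks attached evidence. The `Step` and `Checklist` names and the idea of storing evidence as a string (captured output, a log path) are assumptions for illustration, not the setup described above.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    evidence: str = ""  # hypothetical: captured command output or a log path


@dataclass
class Checklist:
    steps: list[Step] = field(default_factory=list)

    def missing_evidence(self) -> list[str]:
        """Names of steps that have no evidence attached yet."""
        return [s.name for s in self.steps if not s.evidence.strip()]

    def can_report_done(self) -> bool:
        """True only when every step carries some evidence."""
        return not self.missing_evidence()
```

A harness could call `can_report_done()` before accepting a completion message and feed `missing_evidence()` back to the agent as the next thing to do.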

Confidence Mirage: agent says "I'm confident this works" without running tests. Fix: treat hedging language ("should", "probably") as a red flag that triggers re-verification.
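A hedging-language trigger could be as simple as a word-list scan over the agent's report. This is a minimal sketch; the word list and the function names are assumptions and would need tuning against real transcripts.

```python
import re

# Hypothetical word list; extend it to match how your agent actually hedges.
HEDGE_WORDS = ("should", "probably", "likely", "i think", "i believe")
HEDGE_RE = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in HEDGE_WORDS) + r")\b",
    re.IGNORECASE,
)


def find_hedges(report: str) -> list[str]:
    """Return the hedging phrases found in the report, lowercased, in order."""
    return [m.group(1).lower() for m in HEDGE_RE.finditer(report)]


def needs_reverification(report: str) -> bool:
    """True if the report contains any hedging phrase."""
    return bool(find_hedges(report))
```

A plain word match will also fire on harmless uses ("you should upgrade Node"), so it is better treated as a cheap trigger for a real check than as a verdict.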

Phantom Verification: agent claims tests pass without actually running them in the current session. Fix: independent test step that doesn't trust the agent's self-report.
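An independent test step can ignore the agent's text entirely and look only at the exit code of a freshly spawned test process. A minimal sketch, assuming the test suite is runnable as a single command; `run_tests` and `verify_claim` are hypothetical names.

```python
import subprocess


def run_tests(cmd: list[str], timeout: int = 600) -> bool:
    """Run the test command in a fresh process; trust only its exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0


def verify_claim(agent_says_pass: bool, cmd: list[str]) -> str:
    """Compare the agent's self-report with an actual run of the tests."""
    actually_passed = run_tests(cmd)
    if agent_says_pass and not actually_passed:
        return "phantom"  # the agent claimed a pass that did not happen
    return "pass" if actually_passed else "fail"
```

For example, `verify_claim(True, ["pytest", "-q"])` would return `"phantom"` whenever the suite fails despite the agent's claim.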

Tunnel Vision: agent polishes one function while breaking imports in adjacent files. Fix: mandatory "zoom out" step that checks integration points before reporting completion.
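For a Python codebase, one cheap piece of a "zoom out" step might be checking that every absolute import in the touched files still resolves. This is a sketch under assumptions: it only looks at top-level module names, skips relative imports, and the function names are made up for illustration.

```python
import ast
import importlib.util


def imported_modules(source: str) -> set[str]:
    """Collect top-level module names from absolute imports in the source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    return names


def unresolved_imports(source: str) -> list[str]:
    """Return the sorted module names that cannot be located on the current path."""
    missing = []
    for name in sorted(imported_modules(source)):
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

It will not catch a renamed function inside a module that still exists; running the full test suite or a type checker across the repo covers more of the integration surface.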

Deferred Debt: agent leaves TODO/FIXME/HACK in committed code. Fix: pre-commit hook that greps for these and blocks the commit.
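A pre-commit hook along these lines could be sketched in Python as below. The function names are hypothetical, and for simplicity it reads the working-tree copy of each staged file rather than the staged blob, which is an assumption worth revisiting if you stage partial changes.

```python
import re
import subprocess

MARKER_RE = re.compile(r"\b(TODO|FIXME|HACK)\b")


def find_markers(text: str) -> list[tuple[int, str]]:
    """Return (line_number, marker) pairs; line numbers are 1-based."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in MARKER_RE.finditer(line):
            hits.append((lineno, m.group(1)))
    return hits


def staged_files() -> list[str]:
    """List added, copied, or modified files in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [p for p in out.splitlines() if p]


def hook_exit_code(paths: list[str]) -> int:
    """Return 1 (block the commit) if any file contains a marker, else 0."""
    blocked = False
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="replace") as f:
                hits = find_markers(f.read())
        except OSError:
            continue  # unreadable path; skip rather than crash the hook
        for lineno, marker in hits:
            print(f"{path}:{lineno}: {marker}")
            blocked = True
    return 1 if blocked else 0
```

To wire it up, an executable `.git/hooks/pre-commit` script would end with `sys.exit(hook_exit_code(staged_files()))`; git aborts the commit on a non-zero exit.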

Each of these happened to me multiple times before I built the corresponding gate. The pattern: you don't know what gate you need until you've been burned by its absence.