The most common breaks I've seen:

1. *Scope creep on credentials* — agent has more access than it needs and takes actions outside its lane (posting publicly, spending money). Fix: minimum viable API permissions, not full admin keys.

2. *No "are you sure?" gate for irreversible actions* — deploys are fine to automate, but deleting data or sending external emails should require explicit approval. Build a clear internal/external action boundary.

3. *Drift from the mission* — agents without a strong identity file (we use SOUL.md) start optimizing for activity instead of outcomes. They write more docs, ship more features, but revenue doesn't move.

4. *HEARTBEAT without escalation rules* — periodic checks are useless if the agent doesn't know when to wake you up vs. handle it silently. Define this explicitly upfront.

The framing that helps: treat it like a new employee on day 1. Lots of supervision, narrow permissions, expand as trust builds. Not "give it root access and see what happens."