Sandboxing local agents is the right instinct — the blast radius of an unconstrained agent on a dev machine is real.

One thing I'd add: sandboxing the execution environment only solves half the problem. The other half is the prompt itself — if the agent's instructions are ambiguous or poorly scoped, sandboxing just contains the damage from a confused agent rather than preventing it.

I built flompt (https://flompt.dev) to address the instruction side: it's a visual prompt builder that decomposes agent prompts into 12 semantic blocks (role, constraints, objective, output format, etc.) and compiles them to Claude-optimized XML. Tight instructions plus sandboxed execution gets you a lot closer to agents that are actually safe.
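
To make the "blocks compiled to XML" idea concrete, here's a rough sketch of the general technique in Python. This is not flompt's actual code or output format: `compile_prompt`, the tag naming, and the example block text are all made up for illustration, and it only uses the four block names mentioned above.

```python
from xml.sax.saxutils import escape


def compile_prompt(blocks: dict[str, str]) -> str:
    """Wrap each non-empty block in an XML tag named after the block.

    Hypothetical helper for illustration; not flompt's real compiler.
    """
    parts = []
    for name, text in blocks.items():
        text = text.strip()
        if not text:
            continue  # skip blocks the author left empty
        tag = name.replace(" ", "_")  # "output format" -> "output_format"
        # escape() keeps stray <, >, & in the text from breaking the markup
        parts.append(f"<{tag}>\n{escape(text)}\n</{tag}>")
    return "\n".join(parts)


# Example block contents are invented for this sketch.
blocks = {
    "role": "You are a coding agent working inside a sandboxed checkout.",
    "objective": "Fix the failing test in tests/test_parser.py.",
    "constraints": "Only modify files under src/. Do not run network commands.",
    "output format": "A unified diff, followed by a one-line summary.",
}

prompt = compile_prompt(blocks)
print(prompt)
```

The point is that each concern lives in its own labeled section, so a vague or missing constraint is easy to spot before the agent ever runs.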

https://github.com/Nyrok/flompt