Feels like most agent security discussions focus on where the agent runs (VMs, sandboxes, etc), but not whether the action itself should execute.
Even in a locked-down VM the agent can still send emails, spin up infra, hit APIs, burn tokens.
A pattern we've been experimenting with is putting an authorization boundary between the runtime and the tools it calls. The runtime proposes an action, a policy evaluates it, and the action only runs if authorization verifies.
Curious if others building agent runtimes are exploring similar patterns.
agree, maybe use threadlocker-like mode? confirm any action before it ran, but then it defeat the purpose of autonomous agents.
[dead]