Not sure I understand this. Agent CLIs already use sandbox-exec, and you can configure granular permissions. You're basically saying: give the agent access to everything, then configure permissions in a second sandbox-exec wrapper on top. But why use this over editing the CLI's settings file directly (e.g. https://code.claude.com/docs/en/sandboxing#configure-sandbox...)?

I have sandbox-exec set up for Claude like you suggest, but I'm not sure every CLI supports it. Claude only added it a month or two ago. A wrapper CLI that lets you sandbox any command is pretty appealing (the Claude config was not trivial).

The downside of sandboxing the whole agent is that the sandbox needs access to more than the commands technically need (Claude keys, for example). I'm working on a version where you sandbox the agent's Bash tool, not the agent itself: https://github.com/Kiln-AI/Kilntainers

I like the idea but not the MCP part.

How about using bash-tool to intercept the commands and then passing them onto the containers?

https://github.com/vercel-labs/bash-tool

That's exactly what it does -- the bash commands are passed into the containers. It also manages container lifecycle (starting on first request, cleanup on connection shutdown).

If your agent tool already includes a bash tool that calls the host OS, just remove that one and add this.
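
For anyone who wants to picture the lifecycle described above (start on first request, clean up on shutdown), here is a rough Python sketch. It is not how Kilntainers is actually implemented. The class name, the default image, and the injectable `run` callable are all made up for illustration, and it only assumes the standard `docker run`, `docker exec`, and `docker rm` subcommands.

```python
import subprocess
from typing import Callable, List, Optional

Runner = Callable[[List[str]], str]


def _run(argv: List[str]) -> str:
    """Run a command on the host and return its stdout."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


class LazyContainerShell:
    """Hypothetical bash tool: commands run in a container, never on the host."""

    def __init__(self, image: str = "ubuntu:24.04", run: Runner = _run) -> None:
        self.image = image            # assumed image, pick whatever you need
        self._run = run               # injectable so it can be faked in tests
        self._container_id: Optional[str] = None

    def exec(self, command: str) -> str:
        # Start the container only when the first command arrives.
        if self._container_id is None:
            self._container_id = self._run(
                ["docker", "run", "-d", "--rm", self.image, "sleep", "infinity"]
            ).strip()
        # Every command the agent issues goes through `docker exec`.
        return self._run(
            ["docker", "exec", self._container_id, "bash", "-c", command]
        )

    def close(self) -> None:
        # Cleanup on connection shutdown; safe to call more than once.
        if self._container_id is not None:
            self._run(["docker", "rm", "-f", self._container_id])
            self._container_id = None

    def __enter__(self) -> "LazyContainerShell":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()
```

Using it as a context manager (`with LazyContainerShell() as sh: sh.exec("ls")`) ties the cleanup to the end of the session, which is roughly the "cleanup on connection shutdown" part.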

My bad, looks like I misunderstood how bash-tool works.

Then how about running Claude Code or your harness of choice inside bubblewrap with a shim/stub for the base binary?

https://github.com/containers/bubblewrap
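
A minimal sketch of what such a shim could look like, in Python. Treat it as an assumption-heavy illustration: `REAL_BINARY` is a hypothetical path to wherever you moved the real agent binary, the function names are invented, and the bind layout (read-only root, writable project directory) is just one reasonable choice. The bwrap flags themselves are standard ones.

```python
import os
import sys
from typing import List

# Hypothetical location of the real agent binary after renaming it.
REAL_BINARY = "/opt/agent/bin/claude.real"


def bwrap_argv(command: List[str], workdir: str,
               allow_network: bool = True) -> List[str]:
    """Build a bwrap command line: read-only root, writable workdir."""
    workdir = os.path.abspath(workdir)
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",        # whole filesystem visible but read-only
        "--dev", "/dev",              # fresh minimal /dev
        "--proc", "/proc",            # fresh /proc
        "--tmpfs", "/tmp",            # private scratch space
        "--bind", workdir, workdir,   # the only writable host path
        "--chdir", workdir,
        "--die-with-parent",          # kill the sandbox if the shim dies
    ]
    if not allow_network:
        argv.append("--unshare-net")  # an agent usually needs its API, so off by default
    return argv + command


def exec_shim() -> None:
    """Replace this process with the sandboxed real binary."""
    argv = bwrap_argv([REAL_BINARY, *sys.argv[1:]], os.getcwd())
    os.execvp(argv[0], argv)
```

You would install a tiny script under the agent's usual name that calls `exec_shim()`, so anything launching the agent goes through bwrap without knowing it. The agent's own state directory would probably need an extra `--bind` as well.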

I've had trouble getting the sandbox functionality baked into agents to do what I want, particularly with Gemini CLI. Being able to write your own .sb file is more powerful and portable.

Claude Code seemed to be able to reach outside its own sandbox sometimes, so I lost trust in it. Manually wrapping it in sandbox-exec solved the issue.
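
For reference, the manual wrapping can be sketched like this in Python. The profile is an assumed example, not one shipped by any CLI: it allows everything, then denies file writes outside the project directory and a couple of scratch paths. `sandbox_exec_argv` and the `WORKDIR` parameter name are made up; `-p` and `-D` are real sandbox-exec options.

```python
import os
from typing import List

# Assumed profile for illustration only. Later rules override earlier ones,
# so writes are denied everywhere except the paths re-allowed at the end.
PROFILE = """
(version 1)
(allow default)
(deny file-write*)
(allow file-write*
  (subpath (param "WORKDIR"))
  (subpath "/private/tmp")
  (literal "/dev/null"))
"""


def sandbox_exec_argv(command: List[str], workdir: str) -> List[str]:
    """Build a sandbox-exec command line that confines writes to workdir."""
    # realpath matters on macOS, where e.g. /tmp is a symlink to /private/tmp.
    workdir = os.path.realpath(workdir)
    return ["sandbox-exec", "-D", f"WORKDIR={workdir}", "-p", PROFILE, *command]
```

Passing the result to `subprocess.run` (or `os.execvp`) launches the agent under the profile. A real agent will likely also need write access to its own config directory, so expect to extend the allow list.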

I think the idea here is to move the responsibility layer away from the agent. Rather than trusting each CLI to behave and having to learn a specific config for each one, this standardizes and centralizes it (OP's tool works for any agent, not just Claude).