Thank you for your work - I have sent many of your links to my people.
Your point is totally fair for evaluating security tooling. A few notes:
1. I implemented this in Bash to avoid having an opaque binary in the way.
2. All sandbox-exec profiles are split up into individual files by specific agent/integration, and are easily auditable (https://github.com/eugene1g/agent-safehouse/tree/main/profil...)
3. There are E2E tests validating sandboxing behavior under real agents
4. You don't even need the Safehouse Bash wrapper, and can use the Policy Builder to generate a static policy file with minimal permissions that you can feed to sandbox-exec directly (https://agent-safehouse.dev/policy-builder). Or feed the repo to your LLMs and have them write your own policy from the many examples.
5. This whole repo should be a StrongDM-style readme to copy&paste to your clanker. I might just do that "refactor", but for now I've added LLM instructions for creating your own sandbox-exec profiles: https://agent-safehouse.dev/llm-instructions.txt
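To make point 4 concrete, here is a rough sketch of what a hand-written static policy fed straight to sandbox-exec might look like. The make_profile helper and the /tmp/agent.sb path are made up for illustration, and this is not what the Policy Builder actually emits. A real agent will almost certainly need more rules than this skeleton has, so treat the profiles in the repo as the reference.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Safehouse): print a deny-by-default SBPL
# profile that allows reads everywhere but writes only under one directory.
make_profile() {
  local workdir="$1"
  cat <<EOF
(version 1)
(deny default)
(allow process-exec)
(allow process-fork)
(allow file-read*)
(allow file-write* (subpath "${workdir}"))
EOF
}

# Usage on macOS (assumed invocation, shown as a comment so nothing runs here):
#   make_profile "$PWD" > /tmp/agent.sb
#   sandbox-exec -f /tmp/agent.sb your-agent-command
```

The point is only that the policy is a plain text file you can read and diff, with no wrapper in between.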
I love this implementation. Do you find SBPL deficient in any way?
Would xcodebuild work in this context? Presumably I'd watch a log (or have an agent) and add permissions until it works?
SBPL is great for filesystem controls and I haven’t hit roadblocks yet. I wish it offered more control over outbound network requests (i.e., filtering by domain), but I understand why it doesn't.
Yes, Safehouse should work for xcodebuild workloads in the way you described: try to run it, watch for failures, extend the profile, try again. Your agent can do this in a loop by itself. Just feed it the repo, since it contains many integrations that are not enabled by default and will help it.
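The run / fail / extend loop could be sketched roughly like this. Everything here is hypothetical: iterate_profile is not a Safehouse command, and the candidates file (one SBPL rule per line) stands in for whatever you or your agent decide to add after reading the failure. RUNNER is overridable only so the loop can be exercised without macOS.

```shell
#!/usr/bin/env bash
# Assumption: sandbox-exec is the runner; override RUNNER to dry-run elsewhere.
RUNNER="${RUNNER:-sandbox-exec}"

# iterate_profile PROFILE CANDIDATES CMD [ARGS...]
# Re-run CMD under PROFILE, appending the next candidate rule after each failure.
# Returns 0 once CMD succeeds, 1 if the candidates run out first.
iterate_profile() {
  local profile="$1" candidates="$2" rule
  shift 2
  exec 3< "$candidates"            # read candidate rules one at a time via fd 3
  until "$RUNNER" -f "$profile" "$@"; do
    if ! IFS= read -r rule <&3; then
      exec 3<&-
      return 1                     # nothing left to try; still failing
    fi
    printf '%s\n' "$rule" >> "$profile"
  done
  exec 3<&-
  return 0
}

# Example (assumed paths, not run here):
#   iterate_profile ./xcodebuild.sb ./candidate-rules.txt xcodebuild build
```

Note that this blindly adds rules until the command exits 0, so review the resulting profile afterwards and prune anything that wasn't actually needed.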
For anyone reading this later:
I read a little from sandvault and they suggest sandbox-exec doesn't allow recursive (nested) sandboxing, so in addition to the correct SBPL policy you need to set flags on xcodebuild and swift to turn off their own sandboxing.
(I don't think sandvault has a swift/xcode specific policy because they're dumping everything into a sandvault userspace. And it doesn't really concern itself with networking afaict either.)
Yes, you're correct about 'no nested sandboxing'.
This also applies to sandboxing an Electron app: Electron has its own built-in sandboxing on top of the same macOS sandbox mechanism that sandbox-exec uses, so if you're wrapping an Electron app in your own sandbox, you have to disable that inner sandbox (with Electron's --no-sandbox or ELECTRON_DISABLE_SANDBOX=1). In the repo, I have examples of the minimal sandbox-exec rules required to run Claude Code[1] and VSCode[2] (so you can use --dangerously-skip-permissions in their desktop app and VSCode extension).
[1] https://github.com/eugene1g/agent-safehouse/blob/a7377924efa...
[2] https://github.com/eugene1g/agent-safehouse/blob/a7377924efa...
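For the Electron case above, the wrapping might look something like this. The run_electron_sandboxed name and the example paths are placeholders, not anything from the repo. Per the comment above either the flag or the environment variable should be enough, and this sketch sets both just to be explicit.

```shell
#!/usr/bin/env bash
# Assumption: sandbox-exec is the outer sandbox; RUNNER is overridable for dry runs.
RUNNER="${RUNNER:-sandbox-exec}"

# run_electron_sandboxed PROFILE APP_BINARY [ARGS...]
# Launch an Electron binary under your own profile with its inner sandbox disabled,
# since the outer sandbox cannot contain a nested one.
run_electron_sandboxed() {
  local profile="$1" app="$2"
  shift 2
  ELECTRON_DISABLE_SANDBOX=1 "$RUNNER" -f "$profile" "$app" --no-sandbox "$@"
}

# Example (placeholder paths, not run here):
#   run_electron_sandboxed ./vscode.sb "/Applications/Some App.app/Contents/MacOS/Some App"
```

Disabling the inner sandbox means your outer profile is the only thing constraining the app, so it is worth keeping that profile tight.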