Sure, experimental, agent-developed code should be tested in a sandbox. That sandbox contains the damage when running the code goes wrong.
But shouldn't there really be another sandbox where the agentic tool calls execute? This is to contain the damage of the tool execution when it goes wrong.
And the agent harness itself should either implement or be contained in a third sandbox, which contains the damage the agent itself can do. There should also be a firewall layer that limits which tool requests the agent can even make, to contain the damage when the agent formulates inappropriate requests.
The agent also should not possess credentials, so it cannot leak them to the LLM, where they could be transformed into other content that leaks out via covert channels.
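To make the firewall idea a bit more concrete, here is a rough sketch of a policy check that sits between the agent and the tool executor. Everything in it is an assumption for illustration: the `ToolRequest` shape, the tool names in `ALLOWED_TOOLS`, and the strings in `BLOCKED_SUBSTRINGS` are hypothetical and do not come from the article.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolRequest:
    """Hypothetical shape of a tool request formulated by the agent."""
    tool: str
    args: dict = field(default_factory=dict)


# Hypothetical policy: only these tools may be requested at all.
ALLOWED_TOOLS = {"read_file", "run_tests"}
# Hypothetical policy: arguments must not mention these strings.
BLOCKED_SUBSTRINGS = ("/etc/shadow", ".ssh/")


def check_request(req: ToolRequest) -> tuple[bool, str]:
    """Return (allowed, reason). Nothing is executed here, only inspected."""
    if req.tool not in ALLOWED_TOOLS:
        return False, f"tool {req.tool!r} is not on the allowlist"
    for value in req.args.values():
        text = str(value)
        for needle in BLOCKED_SUBSTRINGS:
            if needle in text:
                return False, f"argument mentions blocked string {needle!r}"
    return True, "ok"
```

A substring check like this is easy to bypass and is only here to show where the layer sits. The allowlist is the part doing the real work: requests the policy does not know about never reach a tool.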
Yes, and it's also because the agent described in the post performs operations on user code (fixing CI pipelines, rerunning tests, fixing them, etc.). So another big reason to use the sandbox is to run things like bash on user code. You don't want credentials or anything trusted inside that sandbox, including the LLM API key.
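As a small illustration of the "nothing trusted inside" part, here is a sketch that builds the environment for an untrusted command from an explicit allowlist instead of inheriting the host's. The names `SAFE_VARS`, `sandbox_env`, and `run_untrusted` are made up for this example, and stripping environment variables is not a sandbox by itself; it only shows how to keep keys such as the LLM API key from being handed to the child process.

```python
import subprocess

# Hypothetical allowlist: the only variables the untrusted process may see.
SAFE_VARS = ("PATH", "LANG")


def sandbox_env(host_env: dict) -> dict:
    """Copy only allowlisted variables; everything else is dropped by default."""
    return {name: host_env[name] for name in SAFE_VARS if name in host_env}


def run_untrusted(argv: list, host_env: dict) -> subprocess.CompletedProcess:
    """Run a command with the reduced environment instead of the host's."""
    return subprocess.run(
        argv,
        env=sandbox_env(host_env),  # replaces the inherited environment entirely
        capture_output=True,
        text=True,
        timeout=60,
    )
```

An allowlist is used rather than a denylist so that a newly added secret is excluded by default instead of leaking until someone remembers to block it.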
Author here. Depending on how it’s designed, the harness itself doesn’t need any sandboxing.
At the end of the day, it’s a “simple” loop that calls an external API (LLM) and receives requests to execute stuff on its behalf.
It’s not the agent running bash commands: you (the harness author) are, and you’re in full control of where and how those commands get executed.
In the article’s case, bash commands are forwarded to a sandbox; nothing ever runs on the harness itself (it physically can’t, since local execution is not even implemented in the harness).
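Roughly, the shape described above looks like the sketch below. It is not the article's actual code: `ToolCall`, `Sandbox`, `harness_loop`, and `RecordingSandbox` are hypothetical names, and the LLM is stood in for by a plain callable so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol


@dataclass
class ToolCall:
    """Hypothetical request from the LLM to run a shell command."""
    command: str


class Sandbox(Protocol):
    """Anything that can execute a command somewhere other than the harness."""
    def run(self, command: str) -> str: ...


def harness_loop(
    next_step: Callable[[List[str]], Optional[ToolCall]],
    sandbox: Sandbox,
    max_steps: int = 10,
) -> List[str]:
    """Ask the model for the next step and forward each command to the sandbox.

    The loop never executes a command itself: it has no local execution path,
    only the sandbox.run() call.
    """
    history: List[str] = []
    for _ in range(max_steps):
        call = next_step(history)  # stand-in for the external LLM API call
        if call is None:           # the model has nothing more to run
            break
        history.append(sandbox.run(call.command))
    return history


class RecordingSandbox:
    """Fake sandbox for the example: records commands instead of running them."""
    def __init__(self) -> None:
        self.seen: List[str] = []

    def run(self, command: str) -> str:
        self.seen.append(command)
        return f"(sandbox) ran: {command}"
```

In a real setup the `Sandbox` implementation would forward the command to a remote or isolated environment, and that implementation is the only place where execution happens.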