I'm also bemused by the number of people who think they've got an effective sandbox yet their sandboxed agent has access to all of their code, their github, and unrestricted web access.
I keep telling folks that they need to imagine an LLM (even a "local" one) as if they're farming the work out to JS code running in some dude's browser somewhere: it can't keep a secret, and a determined person can make it emit anything they like.
We need to be asking what the most devious and malicious output could be, and whether what we do with that output (e.g. arguments to command-line tools) would still be safe.
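To make the command-line case concrete, here is a minimal Python sketch of treating model output as hostile before it reaches a tool. Everything in it is an assumption for illustration: the `build_git_argv` helper, the allowlist of read-only git subcommands, and the path pattern are hypothetical, not something from any particular agent framework.

```python
import re

# Hypothetical allowlist: only read-only subcommands the agent may request.
ALLOWED_SUBCOMMANDS = {"status", "log", "diff"}

# Relative paths only; the first character may not be "-", "." or "/",
# so option injection and absolute paths are rejected up front.
SAFE_PATH = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_./-]*")


def build_git_argv(subcommand: str, paths: list[str]) -> list[str]:
    """Turn untrusted model output into an argv list, or raise ValueError."""
    if subcommand not in ALLOWED_SUBCOMMANDS:
        raise ValueError(f"subcommand not allowed: {subcommand!r}")
    for path in paths:
        if not SAFE_PATH.fullmatch(path) or ".." in path.split("/"):
            raise ValueError(f"suspicious path: {path!r}")
    # "--" tells git that everything after it is a path, never an option.
    return ["git", subcommand, "--", *paths]
```

The resulting list would go to something like `subprocess.run(argv)` without `shell=True`, so no shell ever parses the model's text. This only narrows one tool's attack surface; it says nothing about what the agent can do elsewhere.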
From my perspective, everyone is doing it. Security through obscurity - obviously if you’re harboring users’ credit card numbers or personal details, maybe take heed. But if you’re a regular, run-of-the-mill CRUD application, every other company is ALSO throwing caution to the wind. When hundreds of thousands of credentials are leaked into the funnel, does it really matter?
I’m at a small company, and I try to push for security as much as I can, but the stakeholders truly do not care. They want to move fast. It’s just part of the new world I guess. If we get hit by attackers? I don’t know what happens. Sorry, we told you not to - you wanted to move quick and break stuff, this is how that culminates.
I’m sure I’m not the only one.
The answer to that question seems obvious: No, it is not safe.
Yet with tens of millions of developers using these tools, there have not been widespread incidents of this sort as far as I know.
So it leaves me with a few choices:
- manually review and approve each command: obviously not realistic, you would just click Approve
- use a sandbox and hope the exploit is not devious enough to escape the sandbox when you run or open the project outside of the sandbox
- use AI without web access and limit other external dependencies
- don't use agentic AI
- use the Claude or Codex auto-approval classifier and hope for the best
Personally, I'm going with the last option for now.
We do have ways to avoid giving an LLM any secrets, but it needs to be the simple, default solution.
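One of those ways, as a rough Python sketch: launch the agent's subprocesses with an environment built from scratch instead of inheriting the parent's. The `run_without_secrets` helper and the particular variables it sets are assumptions for illustration. It only covers environment variables, so files such as `~/.ssh` or dotfiles would still need filesystem isolation.

```python
import subprocess


def run_without_secrets(argv: list[str], workdir: str) -> subprocess.CompletedProcess:
    """Run argv with a minimal environment instead of the parent's."""
    # Built from scratch: API keys and tokens exported in the parent
    # shell are simply never copied in.
    clean_env = {
        "PATH": "/usr/bin:/bin",
        "HOME": workdir,  # point HOME away from the real home directory
        "LANG": "C.UTF-8",
    }
    return subprocess.run(
        argv,
        env=clean_env,
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,
    )
```

Passing `env=` replaces the inherited environment entirely, which is why an allowlist of variables is easier to reason about than trying to strip known secret names.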
> yet their sandboxed agent has access to all of their code, their github, and unrestricted web access.
Not in my sandbox. It gives no direct access to the workdir, no access to my github, my ssh keys, my security tokens or API keys. No access to my home dir or dotfiles. Nothing at all, except for what I explicitly tell it to give access to.
I can restrict network access. I can choose the isolation level: docker containers, Kata VMs, seatbelt, tart, even the new apple containers (which are VERY nice).
Not even ENV leaks through.
And it's FOSS: https://github.com/kstenerud/yoloai
One bad npm package can really ruin your day. For me, these things only run in their own VM with its own GitHub account and basically nothing else.
People probably think you’re being ridiculous, but Shai Hulud had its very first attempt at manipulating AI-led analysis, and I know of at least one company where that resulted in them getting pwned.
This is only going to become more of a problem in the future and people need to educate themselves on the technical barriers to use because guardrails only sometimes work.
If anyone's looking to sandbox network, I've had good experience with pasta [1] networking. I make a pasta+bwrap sandbox and expose only specific services via local sockets to cross the boundary.
[1]: https://passt.top/passt/
I use a separate physical machine and a scoped token with access to a single repository at a time, and even then I worry about what hole I may have left open.
The general carelessness of the average user is baffling.