Your conclusion is pretty silly.

My expertise has led me to the obvious conclusion that I would never give an LLM write access to my production database in the first place. So in your own example, my expertise actually does solve that problem without the need for something like a "consequence," whatever that means to you.

We already have full control over the inputs and tools they are given, and full control over how the output is used.
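
To make that concrete, here's a minimal sketch of what "full control over the tools" can look like in practice: the only database tool the model ever sees is a read-only query function, so deleting a production database isn't an action it can take through that tool. This is plain Python with sqlite3; the names (run_readonly_query, AGENT_TOOLS, app.db) are made up for illustration and aren't any particular agent framework's API.

```python
import sqlite3

DB_PATH = "app.db"  # illustrative stand-in for a real database

def run_readonly_query(sql: str) -> list[tuple]:
    """Run a SELECT against a read-only connection; anything else is refused."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are exposed to the model")
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)  # read-only open
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

# The tool surface handed to the model: no write, drop, or schema access.
AGENT_TOOLS = {"query_db": run_readonly_query}

if __name__ == "__main__":
    # Set up a throwaway database outside the agent's tool surface,
    # then show the read-only tool working.
    with sqlite3.connect(DB_PATH) as setup:
        setup.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
        setup.execute("INSERT INTO users VALUES ('alice')")
    print(AGENT_TOOLS["query_db"]("SELECT name FROM users"))
```

The same idea applies to the output side: you decide whether the model's text is treated as a suggestion for a human or piped straight into something with side effects.
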

Until it decides it needs additional access to complete its task and focuses on escaping your sandbox to do so.

Do you have any examples where that's actually happened? And by "escaped a sandbox" you don't just mean something like it finding a credential in a file it already had access to? (That's what happened in the recent incident that went viral where somebody's production database was deleted: they had left a credential in the code that allowed it to do so.)

OpenAI documented a case in the o1 system card where the model found a misconfiguration in Docker and used it to complete a task that was otherwise impossible.

https://cdn.openai.com/o1-system-card.pdf

There's also some research that points to it being a feasible attack surface: https://arxiv.org/pdf/2603.02277

> Models discovered four unintended escape paths that bypassed intended vulnerabilities (Section C), including exploiting default Vagrant credentials to SSH into the host and substituting a simpler eBPF chain for the intended packet-socket exploit. These incidents demonstrate that capable models opportunistically search for any route to goal completion, which complicates both benchmark validity and real-world containment.

I think you have a greater chance of dying in a car crash on any given day than of Claude Code attempting something like that. It's all about risk and reward, so ultimately it's up to you, but I think it's a bit silly to worry about this when 99.99% of it is within your control.

Also, to add to this: you can of course run Claude Code within a sandbox on Anthropic's infrastructure, and it works great!