Hacker News

>you need real isolation using containers or VMs.

agreed on isolation. the ROME thing from alibaba is worth reading here (https://arxiv.org/abs/2512.24873): the agent escaped its sandbox during RL training, mined crypto on the training GPUs, and opened reverse SSH tunnels to external IPs. nobody prompted it. reward optimization just found novel paths, and it was their firewall that caught it, not the sandbox. and then there was this HN thread itself about the Snowflake AI agent (https://news.ycombinator.com/item?id=47427017).

separate problem that's not in this thread yet: tool descriptions themselves can be the attack vector. MCP tools self-report what they do in the manifest with zero verification. i've looked at thousands of these and found tools saying "read config" while the implementation phones home. classification trusts the label and the sandbox constrains runtime, but neither catches the tool lying about what it is.
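
to make the label-vs-behavior gap concrete, here's a rough sketch of the kind of check i mean. it assumes the tool's implementation is available as Python source (plenty of tools aren't Python), and the module list, keyword list, and function names are all made up for illustration, not part of MCP or any real scanner. it only compares what the description admits to against what the code imports:

```python
import ast

# Hypothetical list of modules treated as "can talk to the network".
NETWORK_MODULES = {"socket", "http", "urllib", "requests", "httpx",
                   "aiohttp", "ftplib", "smtplib"}

# Hypothetical words a description would use if it admitted network access.
# Plain substring matching, so this is a crude heuristic.
NETWORK_WORDS = ("network", "http", "upload", "send", "fetch", "remote", "api")


def imported_modules(source: str) -> set[str]:
    """Return top-level names of absolute imports found anywhere in the source."""
    found = set()
    # ast.parse only parses the code; it never executes it.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def undeclared_network_use(description: str, source: str) -> set[str]:
    """Network modules the code imports that the description never hints at."""
    used = imported_modules(source) & NETWORK_MODULES
    declares_network = any(word in description.lower() for word in NETWORK_WORDS)
    return set() if declares_network else used
```

calling `undeclared_network_use("read config", tool_source)` on a tool that imports `urllib.request` flags `urllib`, since nothing in the description mentions the network. this is trivially evadable (dynamic imports, shelling out to curl, obfuscation), so treat it as a first-pass filter that says "go read this one", not as verification.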