But the correct way to do it is to have a separate account with more privileges, and only give AI access to your standard developer account
That's one way to do it; how about backing up to a remote location every hour? There's more than one way to be careful.
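The hourly remote backup idea can be sketched as a cron entry. The host name and paths here are placeholders, not anything from the thread:

```shell
# Hypothetical crontab entry: mirror the working tree to a remote host every hour.
# backup-host and both paths are placeholders; -a preserves permissions and times,
# --delete keeps the remote copy an exact mirror of the local tree.
0 * * * * rsync -a --delete /home/dev/project/ backup@backup-host:/backups/project/
```

A mirror with `--delete` will also propagate a destructive mistake within the hour, so in practice you'd want dated snapshot directories rather than one mirror.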
I have personally seen AI bypass this multiple times.
Sounds like they're still giving the model the keys to the kingdom, which is my point: stop giving the model an avenue to make catastrophic mistakes. It makes no sense.
If your message is in response to me, which I think it is: I deliberately don’t give access to credentials and env variables. I’ve worked to create restrictions and seen AI models use very interesting methods to bypass them.
Even now my prompt says the AI must verify the path of the files it intends to edit, and get permission before editing, one file at a time and only after permission. I stop it from ignoring those rules at least once a day.
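A prompt rule like "verify the path before editing" can also be enforced in the harness itself rather than trusted to the model. A minimal sketch, assuming a hypothetical workspace root at `/home/dev/project`:

```python
from pathlib import Path

# Hypothetical workspace root the agent is allowed to touch.
WORKSPACE = Path("/home/dev/project").resolve()

def is_editable(path: str) -> bool:
    """Return True only if the fully resolved path stays inside the workspace.

    Resolving first defeats ../ traversal and symlink escapes
    (requires Python 3.9+ for Path.is_relative_to).
    """
    return Path(path).resolve().is_relative_to(WORKSPACE)

print(is_editable("/home/dev/project/src/main.py"))        # True
print(is_editable("/home/dev/project/../../etc/passwd"))   # False
```

The point is that the check runs in your tool layer, so the model can "ignore the rule" all it wants and the edit still gets rejected.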
This is not privilege separation/sandboxing. A separate virtual machine for an agent with limited credentials is a reasonably safe approach.
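A container is weaker isolation than the separate VM suggested above, but the same idea can be sketched with one. Image name and paths are placeholders:

```shell
# Sketch: run an agent in a throwaway container that sees only the repo,
# with no network and none of the host's credentials or env variables.
# agent-image and the repo path are hypothetical.
docker run --rm -it \
  --network=none \
  -v "$PWD/repo:/workspace" \
  -w /workspace \
  agent-image:latest
```

The key properties are that the bind mount is the only writable host path, and nothing like `~/.ssh` or `~/.aws` is visible inside the container.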
We kinda need to architect things with the assumption that all token-output from an LLM can be unpredictably sneaky and malicious.
Alas, humans suck at constant vigilance, we're built to avoid it whenever possible, so a "reverse centaur" future of "do what the AI says but only if you see it's good" is going to suck.
I built my own IDE to replace vscode / cursor so I could design the harness and ensure that the model tool access was secure and limited. But the rest of the industry is YOLO