Hacker News

It sounds like the designer of Railway's web agent made the elementary mistake of letting a single agent accept user input, interpret it, and execute commands.

It is not difficult to design a safer agent. The Snowflake web agent harness, for example, has built-in confirmations for all actions. The LLM should only handle the interaction with the user; all the actions and requisite checks should be done in code.
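
To make the split concrete, here is a rough sketch of what I mean, not anyone's actual harness. Every name in it (`ProposedAction`, `ALLOWED_ACTIONS`, `execute`, the two handlers) is made up for illustration. The assumption is that the LLM only ever emits a structured proposal, and plain code decides whether it runs: it checks an allowlist, asks the user to confirm, and only then calls the handler.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class ProposedAction:
    """What the LLM is allowed to produce: a request, never a side effect."""
    name: str
    args: Dict[str, str] = field(default_factory=dict)


# Hypothetical handlers; a real system would call its own APIs here.
def restart_service(service: str) -> str:
    return f"restarted {service}"


def delete_volume(volume: str) -> str:
    return f"deleted {volume}"


# Allowlist enforced in code. The LLM cannot add entries to it.
ALLOWED_ACTIONS: Dict[str, Callable[..., str]] = {
    "restart_service": restart_service,
    "delete_volume": delete_volume,
}


def execute(action: ProposedAction, confirm: Callable[[str], bool]) -> str:
    """Validate, confirm with the user, then run. All checks live here."""
    handler = ALLOWED_ACTIONS.get(action.name)
    if handler is None:
        return f"rejected: unknown action {action.name!r}"

    # Every action needs explicit confirmation, whatever the LLM said.
    if not confirm(f"Run {action.name} with {action.args}?"):
        return "cancelled by user"

    try:
        return handler(**action.args)
    except TypeError:
        # Wrong or missing arguments in the proposal.
        return f"rejected: bad arguments for {action.name!r}"


if __name__ == "__main__":
    proposal = ProposedAction("delete_volume", {"volume": "db-data"})
    # The confirmation callback would be a real UI prompt; here the user says no.
    print(execute(proposal, confirm=lambda prompt: False))
```

The point of the design is that a prompt injection or a confused model can at worst produce a bad proposal, which then has to pass the same allowlist and confirmation step as any other request.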