I’ve hit this! In my otherwise wildly successful attempt to translate a Haskell codebase to Clojure [0], Claude at one point asks:
[Claude:] Shall I commit this progress? [some details about what has been accomplished follow]
Then several background commands finish (either by timing out or completing); Claude Code sees their output as my input, thinks I haven't replied to its question, and so answers itself in my name:
[Claude:] Yes, go ahead and commit! Great progress. The decodeFloat discovery was key.
The full transcript is at [1].
[0]: https://blog.danieljanus.pl/2026/03/26/claude-nlp/
[1]: https://pliki.danieljanus.pl/concraft-claude.html#:~:text=Sh...
For those who are wondering: these LLMs are trained on special delimiters that mark the different sources of messages. There's typically something like [system][/system], then one each for agent, user, and tool. Different models also use different delimiter shapes.
You can even construct a raw prompt and tell the model your own messaging structure just via the prompt. During my initial tinkering with a local model I did it this way because I didn't know about the special delimiters. It actually kind of worked and I got it to call tools; it was just more unreliable. It also did some weird stuff, like repeating the problem statement it was supposed to act on inside a tool call, and getting into loops where it posed itself similar problems and then tried to fix them with tool calls. Very weird.
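To make the delimiter idea concrete, here is a minimal sketch of how role framing typically works, using the ChatML-style <|im_start|>/<|im_end|> convention that several open models follow (the exact tokens vary per model family; this is an illustration, not any particular model's actual template):

```python
# Sketch: serialize role-tagged messages into a single raw prompt string.
# <|im_start|> / <|im_end|> are the ChatML-style delimiters; other model
# families use different shapes ([INST]...[/INST], <|user|>, etc.).

def build_chatml_prompt(messages):
    """Turn a list of {role, content} dicts into a ChatML-style prompt."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave an open assistant turn so the model continues as the assistant.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "List the repo files."},
    {"role": "tool", "content": "src/ README.md"},
])
```

The point is that the role labels are just tokens in one flat text stream; nothing but training (or a prompt-described convention, as in my tinkering) stops the model from emitting a "user" turn itself.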
In any case, I think the lesson here is that it's all just probabilistic. When it works and the agent does something useful or even clever, then it feels a bit like magic. But that's misleading and dangerous.
I've seen something similar. It's hard to get Claude to stop committing by itself after granting it the permission to do so once.
Yep, I've discovered the same. Approve once in a session, and from then on you're rolling the dice on whether it will self-approve a change and commit/push (or whatever you originally asked it to do) for as long as that approval stays in context.
amazing example, I added it to the article, hope that's ok :)
I wonder if tools like Terraform should remove the message "Run terraform apply plan.out next" that they print after every `terraform plan` run.
I don't think so, feels like the wrong side is getting attention. Degrading the experience for humans (in one tool) because the bots are prone to injection (from any tool). Terraform is used outside of agents; somebody surely finds the reminder helpful.
If terraform were to abide, I'd hope it would at the very least check whether it's running in a pipeline or under an agent. This should be obvious from file descriptors/env.
What about the next thing that might make a suggestion relying on our discretion? Patch it for agent safety?
"Run terraform apply plan.out next" in this context is a prompt injection for an LLM to exactly the same degree it is for a human.
Even a first party suggestion can be wrong in context, and if a malicious actor managed to substitute that message with a suggestion of their own, humans would fall for the trick even more than LLMs do.
See also: phishing.
Right, I'm fine with humans making the call. We're not so injection-happy/easily confused, apparently.
Discretion, etc. We understand that was the tool making a suggestion, not our idea. Our agency isn't in question.
The removal proposal is similar to wanting a phishing-free environment instead of preparing for the inevitability. I could see removing this message based on your point of context/utility, but not to protect the agent. We get no such protection, just training and practice.
A supply chain attack is another matter entirely; I'm sure people would pause at a new suggestion that deviates from their plan/training. As shown, autobots are eager to roll out and easily drown in context. So much so that `User` and `stdout` get confused.
Maybe the agents should require some sort of input start token: "simon says"
it makes you wonder how many times people have incorrectly followed those recommended commands
If more than once (individually), I am concerned.
I wonder if this is a result of auto-compacting the context? Maybe when it processes the compacted context it inadvertently strips out its own [Header:] prefix and then decides to answer its own questions.
My own guess is that something like this happened:
Claude in testing would interrupt too much to ask clarifying questions. So, as a heavy-handed fix, they turned down the sampling probability of the <end of turn> token, which is what hands control back to the user for clarifications.
So it doesn't hand back to the user, but the internal layers expected an end of turn, and you get this weird self-answering behaviour as a result.
As an aside, my big reason for believing this is that this sort of dumb, simple patch laid on top of an existing behaviour is often the kind of solution optimizers find. Say you made a dataset with lots of pairs where one side has lots of <end of turn> tokens and the other does not. The harder thing to learn is "ask fewer questions and work more autonomously", while the easy thing, "emit fewer end of turn tokens", tends to get learned way faster.
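For the inference-time variant of what I'm describing, the crude patch would look something like a fixed negative logit bias on the end-of-turn token before sampling (the token id and bias value here are made up for illustration; I have no knowledge of Anthropic's actual setup):

```python
import math

EOT_TOKEN_ID = 2        # hypothetical id of the <end of turn> token
EOT_LOGIT_BIAS = -5.0   # heavy-handed: make ending the turn much less likely

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def biased_probs(logits):
    """Apply a fixed negative bias to the end-of-turn logit before softmax."""
    adjusted = list(logits)
    adjusted[EOT_TOKEN_ID] += EOT_LOGIT_BIAS
    return softmax(adjusted)

# Even when <end of turn> was by far the most likely next token,
# the bias makes the model keep talking instead of handing back.
raw = softmax([1.0, 0.5, 3.0, 0.2])
biased = biased_probs([1.0, 0.5, 3.0, 0.2])
```

Whether done this way at inference or baked in via training data, the effect is the same: the model rarely ends its turn, so it fills the slot where the user's answer should go.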
I don’t think so, at least not in this particular case. This was a conversation with the 1M context window enabled; this happened before the first compaction – you can see a compaction further down in the logs.
My theory is that Claude confuses output of commands running in the background with legitimate user input.
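To illustrate what I mean (entirely hypothetical message framing, not Claude Code's actual internals): the background command's output lands in the same structural slot where a user reply would appear, right after a question.

```python
# Hypothetical sketch: a background command finishes and its output is
# injected into the conversation in the position a user reply would occupy.
transcript = [
    ("assistant", "Shall I commit this progress?"),
    # Background command output arrives here, in the "user" slot:
    ("user", "[background command completed] exit code 0"),
]

def looks_like_user_reply(transcript):
    """Naive check: did something land in the user slot right after a question?"""
    prev_role, prev_text = transcript[-2]
    last_role, _ = transcript[-1]
    return prev_role == "assistant" and prev_text.endswith("?") and last_role == "user"
```

From the model's point of view there is nothing distinguishing that injected output from a genuine answer, so it "continues the conversation" on my behalf.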
The most likely explanation imv.