Hacker News

"Make no mistakes"

"don't do something that would make me get mad at you."

These prompts sound like they came out of an abusive relationship.

> "NEVER FUCKING GUESS!"

"Oops, I guessed! I'm Sorry~~ uWu!!"

- Claude Opus 4.6, when asked to run a root cause analysis on itself

hmmmm ok, what if we add a bit more profanity to that? perhaps some extra exclamation marks? maybe that'll make the agents actually follow the rules?