"Make no mistakes"
"don't do something that would make me get mad at you."
These prompts sound like abusive relationships.
> "NEVER FUCKING GUESS!"
"Oops, I guessed! I'm Sorry~~ uWu!!"
- Claude Opus 4.6, when asked to run a root cause analysis on itself
hmmmm ok, what if we add a bit more profanity to that? perhaps some extra exclamation marks? maybe that'll make the agents actually follow the rules?