Hacker News

littlestymaar 3 days ago [ - ]

Anthropic researchers do that quite a lot, their “escaping agent” (or whatever it was called) research that made noise a few month ago was in fact also a sci-fi roleplay…

XenophileJKO 2 days ago [ - ]

Just to re-iterate again... If I read the paper correctly, there were 0 false positives. This means the prompt never elicited a "roleplay" of an injected thought.