Just to reiterate: if I read the paper correctly, there were 0 false positives. That means the prompt never elicited a "roleplay" of an injected thought.