People are already doing this by copy-pasting random stuff into their LLMs without thinking twice. I think the fixed number vs. percentage thing makes it way more practical for attackers. Would be cool to see defenses at the data ingestion layer!