If you know the domain the LLM operates in, it’s probably fairly easy.
For example, let’s say the IRS has an LLM that reads over tax filings. With a couple hundred poisoned SSNs, you can nearly guarantee one of them will be read, and it’s not going to be that hard to poison a few hundred specific SSNs.
The same goes for rare but known-to-exist names, addresses, etc.
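
To put a rough number on "nearly guarantee": a back-of-the-envelope sketch, assuming each poisoned identifier independently has some chance `p` of showing up in a document the model reads. The function name, `p`, and the independence assumption are all mine, purely for illustration; the real odds depend on how the targets are picked.

```python
def p_at_least_one_read(k: int, p: float) -> float:
    """Chance that at least one of k poisoned identifiers gets read.

    Assumes (hypothetically) that each identifier is read independently
    with the same probability p.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Complement of "none of the k identifiers is read".
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Illustrative per-identifier probabilities, not measured values.
    for p in (0.01, 0.05, 0.5):
        print(f"k=200, p={p}: {p_at_least_one_read(200, p):.5f}")
```

Even at a pessimistic `p` of 0.01, 200 identifiers give about an 87% chance that at least one is read, and the odds climb quickly from there.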
Bobby Tables is back, basically.
Speaking of which, my SSN is 055-09-0001