We should be fighting back. So far I have been using Poison Fountain[1] on many of my websites to feed gibberish to LLM scrapers. Its effectiveness is supported by a study from Anthropic showing that a small number of poisoned samples can corrupt whole models[2].

Disclaimer: I'm not affiliated with Poison Fountain or its creators; I just found it useful.

[1] https://news.ycombinator.com/item?id=46926485

[2] https://www.anthropic.com/research/small-samples-poison