> I DO have a major issue with my sites being crawled extremely aggressively by offenders including Meta, Perplexity and OpenAI

Gee, if only we had, like, one central archive of the internet. We could even call it the internet archive.

Then, all these AI companies could interface directly with that single entity on terms that are agreeable.

Internet Archive is missing enormous chunks of the internet though. And I don't mean weird parts of the internet, just regional stuff.

Even articles from the top 10 news websites in my country usually aren't indexed there.

So then make a better one. I was only referencing it as a general concept that can be improved upon as desired.

You think they care about that? They'd still crawl like this just in case, which is why they don't rate limit at the moment.

It would of course need to be legally enforced somehow, with penalties high enough to hurt even the big players.