AI poisoning is a better protection. Cloudflare is capable of serving stashes of bad data to AI bots as a protective barrier for its clients.

AI poisoning is going to get a lot of people killed, because the AI won't stop being used.

The current state of the art in AI poisoning is Nightshade[0] from the University of Chicago. It's meant to eventually be an add-on to their WebGlaze[1], which is an invite-only tool meant for artists to protect their art from AI mimicry.

Nobody is dying because artists are protecting their art

[0] https://nightshade.cs.uchicago.edu/whatis.html

[1] https://glaze.cs.uchicago.edu/webglaze.html

By that logic, AI is already killing people. We can't presume that whatever can be found on the internet is reliable data, can we?

If science has taught us anything, it's that no data is ever reliable. We are pretty sure about many things, and it's the best available info, so we might as well use it. But as for "the internet can be wrong": any source can be wrong! And I wouldn't even be surprised if the internet in aggregate (with the bot reading all of it) is right more often than individual authors of pretty much anything.

Yet we use it every day for police, military, and political targeting with economic and kinetic consequences.

You mean incompetent users of AI will get people killed. You don't get a free pass because you used a tool that sucked.

This is some next level blame shifting. Next you are going to steal motor oil and then complain that your customers got sick when you used it to cook their food.

Okay, let them

You don't think that the AI companies will make an effort to detect and filter bad data for training? Do you suppose they are already doing this, knowing that data quality has an impact on model capabilities?

The current state of the art in AI poisoning is Nightshade[0] from the University of Chicago. It's meant to eventually be an add-on to their WebGlaze[1], which is an invite-only tool meant for artists to protect their art from AI mimicry.

If these companies are adding extra code to bypass artists trying to protect their intellectual property from mimicry, then that is an obvious and egregious copyright violation.

More likely it will push these companies to actually pay content creators for the content they work on to be included in their models.

[0] https://nightshade.cs.uchicago.edu/whatis.html

[1] https://glaze.cs.uchicago.edu/webglaze.html

Seems like their poisoning is something that shouldn't be hard to detect and filter out. There is enough perturbation to create visual artifacts people can see. Steganography research is much further along in being undetectable. I would imagine that in order to disrupt training sufficiently, you would not be able to have so few perturbations that it would go undetected.
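To make the "detect and filter" idea concrete, here is a toy sketch in Python. It is not a Nightshade detector and says nothing about how any AI company actually filters data. It only shows the general idea that an added perturbation raises high-frequency residual energy, which a filter could threshold on. The function names, the random-noise stand-in for a perturbation, and the threshold value are all assumptions made up for illustration. Real poisoning perturbations are optimized rather than random, and real images have natural texture, so a practical filter would be harder than this.

```python
import random


def residual_energy(img):
    """Mean absolute difference between each interior pixel and the
    average of its four neighbours (a crude high-pass measure)."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
            total += abs(img[y][x] - neighbours / 4.0)
    return total / ((h - 2) * (w - 2))


def perturb(img, strength, seed=0):
    """Hypothetical stand-in for a perturbation: bounded random noise."""
    rng = random.Random(seed)
    return [[p + rng.uniform(-strength, strength) for p in row] for row in img]


def looks_perturbed(img, threshold=0.5):
    """Flag an image whose residual energy exceeds an assumed threshold."""
    return residual_energy(img) > threshold


# A smooth synthetic gradient stands in for a clean image.
clean = [[float(x + y) for x in range(32)] for y in range(32)]
noisy = perturb(clean, strength=4.0)

print(looks_perturbed(clean), looks_perturbed(noisy))
```

The smaller the perturbation strength, the closer the two scores get, which is the trade-off the comment points at: a perturbation weak enough to slip under a threshold like this may also be too weak to disrupt training.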

They will learn to pay for high-quality data instead of blindly relying on internet content.