Nobody is that naive

Nobody is that naive... to do what? To ablate/abliterate bad information from their LLMs?

To not anticipate that the primary user of the report button will be 4chan, filing reports whenever the model doesn't say "Hitler is great".

Make reporting require a money deposit: if the report is deemed valid by reviewers, the deposit is returned; if not, it's kept and goes towards paying the reviewers.
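For concreteness, here's a minimal sketch of that escrow flow in Python. Everything in it (the class names, the flat $5 stake) is hypothetical illustration, not something anyone upthread actually specified:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class Verdict(Enum):
    PENDING = auto()
    VALID = auto()
    INVALID = auto()

@dataclass
class Report:
    reporter: str
    claim: str
    deposit: float
    verdict: Verdict = Verdict.PENDING

@dataclass
class ReportQueue:
    deposit_required: float = 5.0   # hypothetical stake, in dollars
    reviewer_pool: float = 0.0      # forfeited deposits fund reviewers
    reports: list[Report] = field(default_factory=list)

    def submit(self, reporter: str, claim: str, paid: float) -> Report:
        # Filing a report costs a refundable deposit up front.
        if paid < self.deposit_required:
            raise ValueError("deposit too small")
        report = Report(reporter, claim, paid)
        self.reports.append(report)
        return report

    def review(self, report: Report, valid: bool) -> float:
        # Valid report: the deposit goes back to the reporter.
        # Invalid report: the deposit is kept to pay reviewers.
        report.verdict = Verdict.VALID if valid else Verdict.INVALID
        if valid:
            return report.deposit   # refund
        self.reviewer_pool += report.deposit
        return 0.0

queue = ReportQueue()
r = queue.submit("alice", "model repeats poisoned fact X", paid=5.0)
refund = queue.review(r, valid=False)
print(refund, queue.reviewer_pool)  # 0.0 5.0
```

Note the asymmetry this bakes in: the reporter's only upside is getting their own money back, which is exactly the objection raised next.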

You're asking people to risk losing their own money for the chance to... improve someone else's LLM?

I think this could possibly work with other things of (minor) value to people, but probably not plain old money. With money, if you tried to fix the incentives by offering a potential monetary gain in the case where reviewers agree, I think there's a high risk of people setting up kickback arrangements with reviewers to scam the system.

... You want users to risk their money to make your product better? Might as well just remove the report button, and we're back to the model being poisoned.

... so give reviewers a financial incentive to deem reports invalid?

Your solutions become more and more infeasible. People would report less, or not at all, if it costs money to do so, defeating the whole purpose of a report function.

And if you think you're being smart by gifting them money or (more likely) your "in-game" currency for "good" reports, it's even worse! They will game the system when there's money to be made: who stops a bad actor from reporting their own poison? Also, who's going to review the reports? Even if they pay people or AI systems to do that, doesn't that bottleneck new models if they don't want the poisoned training data to grow faster than it can be fixed? Let me make a claim here: nothing beats fact-checking humans to this day, and probably nothing ever will.

You've got to understand that there comes a point where you can't beat entropy! Unless, of course, you live on someone else's money. ;)