Idea for a website/documentary: have experts respond to a piece of news, or provide commentary. Put a few expert pieces alongside a few LLM outputs and have people guess/work out which is which. Then have those same people explain how they decided.
If it's on a website, rank the results and present the 'how I worked it out' info for the best spotters (you could interview them too). Keep the answers secret for a few weeks, then reveal them in a way that keeps the game playable.
It's repeatable: every few months you could interview new experts (or the old ones again) and bring in new models.
Kinda like the critical thinking version of images of a pelican on a bike.
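For the website version, the ranking could probably be as simple as "fraction of pieces labeled correctly". A rough sketch of that, assuming each piece is either 'expert' or 'llm'; the piece ids, player names, and function names here are all made up for illustration:

```python
# Hypothetical answer key: piece id -> who actually wrote it.
ANSWER_KEY = {"p1": "expert", "p2": "llm", "p3": "llm", "p4": "expert"}

def score(guesses, key=ANSWER_KEY):
    """Fraction of pieces a player labeled correctly (missing guesses count as wrong)."""
    correct = sum(1 for pid, truth in key.items() if guesses.get(pid) == truth)
    return correct / len(key)

def leaderboard(all_guesses, key=ANSWER_KEY):
    """Players sorted best-first; ties are broken by name so the order is stable."""
    scored = [(name, score(g, key)) for name, g in all_guesses.items()]
    return sorted(scored, key=lambda t: (-t[1], t[0]))

# Made-up players and guesses, purely to show the shape of the data.
players = {
    "alice": {"p1": "expert", "p2": "llm", "p3": "llm", "p4": "expert"},
    "bob": {"p1": "llm", "p2": "llm", "p3": "expert", "p4": "expert"},
}
board = leaderboard(players)
print(board)
```

The top of that list would be the "best spotters" whose explanations get shown. Keeping the answer key server-side until the reveal date is what keeps it a game.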
I love your idea and would enjoy seeing the results of that controlled experiment.
I'm also interested in the broader impact of using LLMs in place of web search for general Q&A when we want 'to know things'. It seems pretty clear that the way LLMs are being used for knowledge acquisition now is often less accurate than searching, while 'feeling' more certain. Even if we set aside explicit hallucinations, I suspect it's still less accurate.