When people judge blindly, they are more likely to think the human is the AI and the AI is the human.
73% judged GPT-4.5 (edit: had incorrectly said 4o before) to be the human.
https://arxiv.org/abs/2503.23674
Not only are people bad at judging this, they are directionally wrong.
There is research showing the contrary that is far more convincing:
> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.
https://arxiv.org/html/2501.15654v2
Great find, I've submitted this preprint as a standalone item: https://news.ycombinator.com/item?id=47678270