Current LLMs are optimized to produce the output a human would most likely generate, not to surpass it.

Or rather, the output most pleasing to a human, which is both better and worse.

Better, because we can spot mistakes even in work we couldn't create ourselves. Think art: most of us can't draw hands, but we can spot when Stable Diffusion gets them wrong.

Worse, because many beliefs are "common sense" and wrong, e.g. https://en.wikipedia.org/wiki/Category:Paradoxes_in_economic..., and we would collectively down-vote a perfectly accurate model of reality for violating them.