Hacker News

Judging a conversation transcript is a lot different from being able to interact with an entity yourself. Obviously one could make an LLM look human by having a conversation with it that deliberately stayed within what it was capable of, but judging such a transcript isn't what most people imagine as a turing test.