I think this is basically right. You don’t hand out calculators before kids understand arithmetic. The LLM version is sneakier because skipping the work still produces something that looks finished.

A calculator gives you an answer. An LLM gives you an answer that sounds like it already checked itself.

Not quite. The LLM gives you a statistically probable-sounding token stream. The calculator gives you a qualified answer within the documented, deterministic limits of the device.

No one knows how to use either.

That is GPT-3. Modern models are rewarded based on the accuracy of their responses.

By... another AI model. Which uses statistical generation to decide whether the answer is likely to be accurate or not.

Which still makes it more than probabilistic-sounding words.

Wait, have we solved hallucination already?

We've gotten better at it, but it's not a problem that can be fully solved (ignoring trivial solutions like answering "I don't know" to every question).

No, they are not.
