> you're wondering if the answer the AI gave you is correct or something it hallucinated

Regular research has the same problem of turning up bad forum posts and other bad sources by people who don't know what they're talking about, albeit usually to a far lesser degree depending on the subject.

Yes, but that is generally public, with other people able to weigh in through various means like blog posts or papers of their own.

Results from the LLM are for your eyes only.

The difference is that LLMs mess with our heuristics. Those heuristics certainly aren't infallible, but over time we develop a sense for when someone is full of shit. The mix-and-match nature of LLMs hides that.

You need different heuristics for LLMs. If the answer is extremely likely-sounding and self-consistent but not anchored in known facts, alarm bells should go off.

A bit like the trope in movies where the protagonists get suspicious because the antagonists agree to every notion during negotiations, since they intend to betray them anyway.

The LLM will hallucinate the most plausible-sounding scenario that conforms to your input/wishes.

I make no claim about P(detect | hallucination), but my P(hallucination | detect) is pretty good.
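
In other words, the claim is about precision, not recall: a minimal sketch with made-up counts (all numbers illustrative, not measurements) showing that a trustworthy alarm can still miss most hallucinations.

```python
# Hypothetical confusion counts for one reader's "this smells off" alarm.
answers = 1000           # total LLM answers reviewed
hallucinations = 200     # of those, how many were actually hallucinated
flagged = 60             # answers where the alarm went off
true_flags = 54          # flagged answers that really were hallucinations

precision = true_flags / flagged        # P(hallucination | detect)
recall = true_flags / hallucinations    # P(detect | hallucination)

print(f"P(hallucination | detect) = {precision:.2f}")  # 0.90: when it fires, it's usually right
print(f"P(detect | hallucination) = {recall:.2f}")     # 0.27: most hallucinations still slip by
```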