This is why it helps to give the LLM some sort of “outlet” for cases where it is not confident in its tokens.

If the log probability of the tokens is low, you can tell it to “produce a different answer structure”. The models are trained to be incredibly helpful - they would rather hallucinate an answer than admit they are uncertain. But if you tell it “or produce this other thing if you are uncertain”, the statistical probability has an “outlet” and it will happily produce that result instead.
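Roughly like this (a minimal sketch with the OpenAI Python SDK; the model name, sentinel string, and threshold are just placeholders I picked for illustration):

    # Sketch: give the model an explicit uncertainty "outlet" in the prompt,
    # then sanity-check the answer against per-token log probabilities.
    import math
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    CONFIDENCE_THRESHOLD = 0.7  # arbitrary cutoff, tune for your use case

    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the question concisely. If you are not confident "
                    "in the answer, reply with exactly: UNCERTAIN"
                ),
            },
            {"role": "user", "content": "Who won the 1904 Tour de France?"},
        ],
        logprobs=True,  # return per-token log probabilities with the answer
    )

    choice = resp.choices[0]
    answer = choice.message.content

    # Average per-token probability as a crude confidence signal.
    probs = [math.exp(t.logprob) for t in choice.logprobs.content]
    avg_prob = sum(probs) / len(probs)

    if answer.strip() == "UNCERTAIN" or avg_prob < CONFIDENCE_THRESHOLD:
        print(f"Model took the outlet or confidence is low (avg p={avg_prob:.2f})")
    else:
        print(f"Answer: {answer} (avg p={avg_prob:.2f})")

The point is that “UNCERTAIN” is a high-probability continuation when the model doesn’t know, so it no longer has to invent something plausible-sounding to stay helpful.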

There was a recent talk about this on the HN YouTube channel.