LLMs don't have the capability to say they don't know, because they don't know what they know. They are, after all, just next-token-predictors.
I just tried both queries with their same query format, just adding an "I Don't Know" label, against Gemini and Claude, and in no cases did they use that label. 2/4 answers were wrong though. But try it for yourself and see:
> Classify this claim as of today: "<claim>". Output exactly one label: True, Mostly True, Misleading, False, or I Don't Know. No explanations, no qualifiers.
LLMs don't have the capability to say they don't know, because they don't know what they know. They are, after all, just next-token-predictors.
I just tried both queries with their same query format, just adding an "I Don't Know" label, against Gemini and Claude, and in no cases did they use that label. 2/4 answers were wrong though. But try it for yourself and see:
> Classify this claim as of today: "<claim>". Output exactly one label: True, Mostly True, Misleading, False, or I Don't Know. No explanations, no qualifiers.