> While hallucination is probably closer to 100% depending on the question.
But the benchmark didn't ask those questions, and it seems Grok is very good at saying it doesn't know the answer otherwise.