Just swap 'Honesty' with 'correctness in its claims' and you'll get what you need out of this aspect of the model description.
Honesty and correctness are not the same thing, even when talking about LLMs. Sometimes an LLM says something false and you can't tell whether it's being dishonest or merely incorrect. Sometimes, however, you can see in the chain of thought (CoT) that the model does know the true fact and is reasoning about how to deceive the user. That's lying, not just being incorrect.