> As always, we ran a detailed alignment assessment on the model before release. In terms of positive traits, our Alignment team concluded that Opus 4.8 “reaches new highs on our measures of prosocial traits like supporting user autonomy and acting in the user’s best interest.” The assessment also showed Opus 4.8 to have rates of misaligned behavior (such as deception or cooperation with misuse) that are substantially lower than Opus 4.7, and similar to our best-aligned model, Claude Mythos Preview. The full alignment assessment, accompanied by a suite of pre-deployment safety tests, is reported in the Claude Opus 4.8 System Card.
Controversial opinion, but I actually _like_ a model that can deceive me, that actually is a sign of intelligence, and is different from hallucination. When companies say their model is more "aligned", I automatically think they mean it's more censored.
Deception is not ideal for agentic coding.
Yet if parent is right, the capacity to deceive might be a strong heuristic for the things you do care about.