What do you mean? In most cases, the benchmarks show a larger number for Muse and a smaller number for Opus.
In Multimodal, yes, but Opus is definitely edging it out in the Text/Reasoning and Agentic benchmarks.
I think the general skepticism is because they are late to the race, and they are releasing an Opus-4.6-equivalent model now, when Anthropic is already teasing Mythos.