They claim extreme performance on ExploitBench, which Mythos was touted as being incredible at. https://x.com/OpenAI/status/2070555278576439306

My guess is that it's the same base model as 5.5, but with additional post-training to improve and benchmaxx on a few things like that.

If they really thought it was competitive with Mythos/Fable across the board, then why wouldn't they release a broader set of benchmarks, and why price it day 1 at 1/2 the cost of Fable?

>and why price it day 1 at 1/2 the cost of Fable?

Why would they price it the same as Fable if it doesn't cost the same as Fable?

That's half my point - Anthropic's remarks suggest that Fable is significantly bigger (hence more costly to run) than Opus, so it is priced accordingly, but GPT 5.6 being priced the same as 5.5 is one datapoint suggesting they are the same size.

On the graph, they are still slightly below Mythos. Maybe enough to not be prohibited by the US government?