At least on their benchmark, the regular, public GPT-5.5 is basically at Mythos level already (2% difference on CyberGym).
They didn't test Opus 4.8, but it probably isn't very far behind.