> We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.
So much for all of the rhetoric about Mythos supposedly far surpassing GPT 5.5 (edit: in cybersecurity, in particular). Of course, the AISI benchmarks also showed this, but it is amusing that Anthropic is saying it now that it is to their advantage.
They aren't saying that other models have the same overall level of capability. They are saying that the specific capability that the US Government tested is also available in other models.
That might also further anger the current administration, if it is so inclined, since the statement openly tells other actors how to achieve the same capability. If the administration chooses not to apply the same restriction to GPT 5.5, then an argument could be made that Anthropic is being singled out by the government.
This is about the specific capabilities that the government called out, not Fable's overall capabilities. My personal experience, having used Fable this week for an extremely complex task, is that it is head and shoulders more powerful than any other model, at least for software engineering.
If this gets 5.5 banned I am going to be hopping mad.
I wonder how many OpenAI employees astro-turf like this.
The best time to get mad was yesterday, when Amodei explicitly asked Trump to do something like this. But now works, too.
Amodei never asked Trump to do this, he asked for an approval process to get powerful models safely in the hands of the public.
It's a shame HN's critical thinking has gone to shit though.
That's what happens when you beg for government's involvement. They might get involved, but not on your terms.
Although I do believe Anthropic knew this and this kind of involvement is still beneficial to them, as it still slows down competition, which is their sole objective when you brush off marketing sprinkles from their statements.
Safety testing of frontier LLMs does not slow down the competition any more than it slows down Anthropic. It does prevent them from doing dumb things like releasing zero-day buttons into the wild, though.
Yeah, you're right. I guess there are other sites, though, where you might feel more at home. Maybe explore a few of those when you're done insulting the people here who told you exactly what was going to happen.
In the meantime, there are a lot of classic morality tales in which deals with the Devil don't work out quite the way the protagonists hoped they would. These stories go back thousands of years, spanning a wide range of creeds, faiths, and cultures. You (and Dario) have some offline reading to do.
I’d suggest you use an LLM to assist you with comprehending their statement. It’ll do a better job, or at the very least be more objective than you’re being now. You’ve misinterpreted the statement. That is not what they’re saying at all. Please actually read instead of skimming until you find something that you believe reinforces your worldview.
Reading comprehension failure on display here from maxall4.
They are saying that the comparison to other models is only about the problems it was jailbroken to complete in the government's example, not all the vulnerabilities it could exploit unjailbroken.