If that ends up being true, GPT5.5 at 70 (and presumably Fable a bit ahead of that) is still in a different league, which was partly my point. To hear the online chatter, GLM5.2 is a tectonic shift in the landscape. In reality, it's just interesting. It's probably safe to bet that once the DeepSWE benchmarks are all fully updated, it won't even be on the Pareto frontier.
I'm not accusing anyone specifically, but I've noticed Chinese bots swamping certain YouTube channels that, for example, cover US defense industry news. They'll downplay any and all technical advances, play up China's dominance, US cowardice, etc. All very transparent. I suspect some of the online conversation about open Chinese models is driven by that. How often do you see people talking about Mistral or Trinity? Never. Because they don't play that game.
There are definitely some Chinese bots + actual people (imagine that!) who like to talk up Chinese models. I'm one of them, but I like to find out how good these models really are before saying anything.
GLM definitely isn't Opus level yet, but it's for sure good. I think it lacks some knowledge (when coding) that the frontier models possess, which is expected given that the model is probably quite small compared to the frontier models.
But people don't say much about Mistral, probably because its models are nowhere near as good. And they don't have a large population behind them to actually use them.