> Bigger is not better
The article cites GLM, which is smaller than DeepSeek yet better on hallucinations, as evidence that "smaller can be good too".
But the GLM family itself is scaling up fast: the GLM-5.x family is 754B, double the previous GLM-4.x generation.
> comes within just 4 points of GPT-5.5 and 9 points of Fable 5
9 percentage points IS a big difference
If we're hand-waving away a 9-point gap between an open source model from a Chinese lab, which you can use in nearly unlimited amounts for <100/mo, and the premier American frontier model, which is unavailable now and was expensive when it was available, we've already lost.