I think what this model actually showed is the cyclical nature of tokens as a commodity.
It comes down to the supply of and demand for GPUs. Demand currently outstrips supply, even though the 'frontier models' are in some ways much more computationally efficient than last year's models, using far fewer computational resources to do the same thing.
So now that everyone wants to use frontier models in "agentic mode", with reasoning eating up a ton more tokens before settling on a result, demand is outpacing supply. It is possible that the two equalize yet again before the cycle begins anew.