Chip costs strongly impact the economics of model serving.
It is entirely plausible to me that Opus 4.7 is designed to consume more tokens in order to artificially reduce the API cost/token, thereby obscuring the true operating cost of the model.
I agree, though; I chose poor phrasing originally. Better to say that GB200 vs. Trainium could contribute to the efficiency differential.
probably the wrong take - they are in an arms race to a better model. it's not the enshittification era for models just yet
Models are still in arms race mode, but harnesses and subscription strategy are tiptoeing into their enshittification era.