I’m not understanding. Even if cost per token hits the floor, that does not mean you want a model that uses more tokens.
If the Chinese are optimizing for lower token usage, that’s also optimizing for speed.
Why use more token if few do trick?