Number of parameters is at least a proxy for model capability.
You can achieve incredible tok/dollar or tok/sec with Qwen3 0.6b.
It just won't be very good for most use cases.
Model capability is the other axis on their chart. So they could have put Qwen3 0.6b there; it would just sit in the bottom right corner.
I know what they are trying to do. They are attempting to show a kind of Pareto frontier, but it’s a little awkward.
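
For anyone unsure what a Pareto frontier means here, a rough sketch in Python. The model names and numbers are made-up placeholders, not real benchmarks, and I'm assuming higher is better on both axes (speed and capability). A model is on the frontier if no other model is at least as good on both axes and strictly better on one.

```python
def pareto_frontier(points):
    """Return names of points that no other point dominates.

    Each point is (name, speed, capability); higher is better for both.
    """
    frontier = []
    for name, speed, cap in points:
        # A point is dominated if some other point is at least as good on
        # both axes and strictly better on at least one of them.
        dominated = any(
            s >= speed and c >= cap and (s > speed or c > cap)
            for _, s, c in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical placeholder values, NOT measured results.
models = [
    ("tiny", 1000.0, 20.0),       # very fast, weak
    ("small", 400.0, 45.0),
    ("slow_small", 300.0, 40.0),  # slower AND weaker than "small"
    ("large", 50.0, 80.0),        # slow, strong
]

print(pareto_frontier(models))
```

Note that the tiny, very fast model lands on the frontier even though it is weak, which is the point about Qwen3 0.6b: being on the frontier doesn't mean being useful for most use cases.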