This is kind of bogus since some of the S and A tier models are pretty useless for reasoning or tool calls and can’t run with any sizable system prompt… it seems to be solely based on tokens per second?
This is kind of bogus since some of the S and A tier models are pretty useless for reasoning or tool calls and can’t run with any sizable system prompt… it seems to be solely based on tokens per second?