They're also fast.
So are Gemini Flash (Lite) and GPT mini/nano.
- 1100 tokens/second Mistral Flash Answers https://www.youtube.com/watch?v=CC_F2umJH58
- 189.9 tokens/second Gemini 2.5 Flash Lite https://openrouter.ai/google/gemini-2.5-flash-lite
- 45.92 tokens/second GPT-5 Nano https://openrouter.ai/openai/gpt-5-nano
- 1799 tokens/second gpt-oss-120b (via Cerebras) https://openrouter.ai/openai/gpt-oss-120b
- 666.8 tokens/second Qwen3 235B A22B Thinking 2507 (via Cerebras) https://openrouter.ai/qwen/qwen3-235b-a22b-thinking-2507
That said, I cannot find non-marketing numbers for Mistral Flash Answers. Real-world tokens/second are likely lower than the demo figure, so this comparison is not entirely fair: the OpenRouter numbers are measured serving averages, while the Mistral number comes from a promotional video.
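One way to get past marketing numbers is to measure throughput yourself against any OpenAI-compatible endpoint (OpenRouter exposes one). A minimal sketch: the helper does the tokens-per-second arithmetic, and the commented-out part shows how it might be wired to a live request; the base URL, model name, and prompt are illustrative assumptions, not a fixed benchmark setup.

```python
def tokens_per_second(completion_tokens: int, elapsed_s: float) -> float:
    """Throughput over the whole request (includes time to first token,
    so it will read slightly lower than pure decode speed)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return completion_tokens / elapsed_s

# Rough live measurement (requires `pip install openai` and an API key;
# model and base URL below are placeholders):
#
# import time
# from openai import OpenAI
# client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="...")
# start = time.monotonic()
# resp = client.chat.completions.create(
#     model="google/gemini-2.5-flash-lite",
#     messages=[{"role": "user", "content": "Count from 1 to 100."}],
# )
# elapsed = time.monotonic() - start
# print(tokens_per_second(resp.usage.completion_tokens, elapsed))
```

Averaging over several runs with longer completions amortizes the time-to-first-token overhead and gets closer to the steady-state decode rate the vendors quote.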