Hacker News

For reference, it's the 2nd fastest model tracked in the "Highlights" section of https://artificialanalysis.ai/

Yes, it's incredibly fast. OpenRouter is clocking 60 tokens per second, which is on par with the likes of Sonnet, Opus, and GPT 5.5.

That section omits Cerebras and Groq, which are up to 5x faster.

Very different tech and limitations though, so I don't think it makes sense to compare them 1:1.

What are the limitations?

Much smaller context window.