Are those with "thinking" or without?
Sonnet 3.7's 70% is without thinking, see https://www.anthropic.com/news/claude-3-7-sonnet
In my experience, the thinking tokens (even just 1024) make a massive difference in real-world tasks with 3.7.
Based on their release cadence, I suspect that o4-mini will compete on price, performance, and context length with the rest of these models.
o4-mini, not to be confused with 4o-mini
With