Lots of releases but very little actual performance improvement

Sonnet and Gemini saw fairly substantial perf increases recently

Love Sonnet, but 3.7 is not obviously an improvement over 3.5 in my real-world usage. Gemini 2.5 Pro is great and has replaced most others for me (I use Grok for things that require real-time answers).

Are you comparing it with or without thinking? I'd say it's a fairly big improvement in long thinking mode.

It does a lot better on philosophy questions.