This puts Sonnet 4.6 above Opus 4.6 in the coding index... kinda hard to trust those numbers.

(Also, it puts Opus 4.7 universally above Opus 4.6, and I may be wrong, but this doesn't seem to match the experience of most/many/some people. I think it's widely recognized that Anthropic is severely lacking compute and that Opus 4.7 is a cost-saving measure.)

What I’ve usually seen is 4.7 -> 4.5 -> 4.6 in terms of quality. Though 4.7 seems to hallucinate more than before.

Anthropic themselves have (had?) this thing where Opus is used for planning and Sonnet for coding.

I thought this was a cost-saving measure: we plan with the frontier model / SOTA, then code with something cheaper.

But then, Anthropic employees don't have rate limits, right?