This is super exciting. I've been poking at it today, and it definitely changes my workflow -- a full three- or four-hour parallel coding session with subagents now generally fits into a single master session.

The stats claim Opus at 1M is roughly comparable to 5.4 at 256k -- these needle-in-a-haystack long-context tests sadly don't always track with quality reasoning ability -- but this is still a significant improvement, and I haven't seen the dramatic falloff in my tests that the Q4 '25 models showed.

P.S. What's up with Sonnet 4.5 getting comparatively better as context got longer?

Did it get better? I used Sonnet 4.5 1M frequently, and my impression was that it was around the same performance but a hell of a lot faster, since the 1M model was willing to spend more tokens at each step vs. preferring more token-cautious tool calls.

Opus 4.6 is way better than Sonnet 4.5, for sure.

Random: are you personally paying for Claude Code, or is it paid for by your employer?

My employer only pays for the GitHub Copilot extension.

GitHub Copilot CLI lets you use all these models (unless your employer disables them).

https://github.com/features/copilot/cli

Disclosure: I work at Microsoft.

I used Claude through Copilot for a long time before switching to CC. Even for the same model, the difference is shocking. Copilot's harness and the underlying Claude models are not well matched compared to the vertically integrated Claude Code harness.

Disclosure: I have to use them via Copilot at work. Be glad I don't write code for nuclear plants. Why does it have to be so hard? Doubly so in JetBrains IDEs, but I've a feeling that's on both of you rather than just you personally. But I still resent you now.

Both. My employer pays for Max 20x at work; I pay for a personal Max 10x for my side projects and personal stuff.