composer feels roughly competitive with opus 4.5?
it largely lags behind opus4.7/gpt5.4, but is respectable, and anecdotally it generally outperforms the glm/qwen equivalents despite what the benchmarks say.
it fails to follow instructions more often and is less critical of code, but performs okay if you can decompose the task into smaller problem spaces, e.g. only do manual review, only do typechecking, only do a specific component, etc.
https://artificialanalysis.ai/agents/coding-agents?coding-ag...
I agree, Composer 2.5 is really good. I use it for all kinds of small tasks, and really for any kind of first pass at debugging, answering questions about the codebase, pulling data for reports, etc. It’s fast, pretty accurate, and basically free.