> With Opus you can give it a long-horizon task (eg build an entire feature) and it will plan it out and implement it and almost always stay on task. This is what people mean when they say "agentic tasks"
I've had no trouble getting the current generation of smaller models to do the same thing. Maybe it's more of a harness issue than a model issue?
Recently I've used both MiniMax M3 and DeepSeek V4 Flash to one-shot moderately complex applications from a written spec, and neither one got lost along the way