I think people who run 15 agents to write a piece of software could probably use 1 or 2 and a better multi-page prompt and have the same results for a fraction of the cost.
Especially with the latest models which pack quite a long and meaningful horizon into a single session, if you prompt diligently for what exactly you want it to do. Modern agentic coding spins up its own sub-agents when it makes sense to parallelize.
It's just not as sexy as typing a sentence and letting your AI bill go BRR (and then talking about it).
I'd like to see some actual results with a meaningful benchmark of software output that shows that agent orchestrators accomplish any meaningful improvement in the state of the art of software engineering, other than spending more tokens.
Maybe it's time to dredge up the Mythical Man-Month?