On one hand, that's sort of true for practical uses - and benchmarks notoriously undercount multi-turn settings.
On another, being able to reliably tackle minor tasks with no handholding is very valuable in itself. Sometimes implementation details are important, but often, the most important thing is to Get It Done.