Agree the per-task capability hasn't been the blocker for a while. But on the autonomous-loop question — in my experience that's not gated by how good the model is on any single step. What kills the loop is it slowly losing the constraints from earlier in the run and walking back decisions you'd already settled.