I like to think about maximizing throughput while minimizing attention: both matter, and the proposal here is expensive on my attention. Optimizing per-task latency matters less than enabling longer non-interactive runs.

For parallelism, I'm finding it more productive to have multiple distinct tasks that I multitask on and guide each to completion. Along the way I improve the repo docs and tools so the AI is more self-sufficient the next time, so my energy goes more to enabling longer runs.

Ex: One worker improving all docs. I can come back, give feedback, and redo all of them. If I'm going to mess with optimizing agent flows, it'd be to make the repo style guide clearer to the AI. In theory I can divide docs sections and manually run sections in parallel, or ask for multiple parallel versions of it all for comparison... But that's a lot of overhead. Instead, I can fork the repo and work another another non-docs issue in parallel. A. Individual task is slow, but I get more tasks done, and with less human effort.

I'd like tools to automate fork/join parallelism for divide-and-conquer plans, and that feels inevitable. For now, they do fairly linear CoT, and easier for me to do distinct tasks vs worrying about coordinating.