That's what I understood from the top-level question, and it's my experience as well. If you don't review the LLM's code, it breaks very quickly. That's why the question for me isn't "how many agents can I run in parallel?" but "how many changes can I review in parallel?"
For me, the answer is "just one", and that's why LLM coding doesn't scale very far with these tools.