What I don't like about this approach is that it mainly improves the chances of zero-shotting a feature, but I require a ping pong with the LLM to iterate on the code/approach. Not sure how to parallelize that; I'm not gonna keep the mental model of 4+ iterations of code in my head and iterate on all of them.

For visual UI iteration this seems amazing given the right tooling, as the author states.

I could see it maybe being useful for TDD. Let four agents run on a test file and implement until it passes. Restrict each agent to 50 iterations; the first one that passes the test terminates the other in-progress sessions. Rinse and repeat.
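If anyone wants to try that, here's a minimal sketch of the race. The `my-agent` CLI, prompt file, and worktree paths are made up; it assumes each agent works in its own checkout and that one invocation of the agent does one edit pass:

```python
# Sketch of "first agent whose tests pass wins"; all tool names are placeholders.
import subprocess
import threading
import concurrent.futures as cf

MAX_ITERATIONS = 50

def tests_pass(workdir: str) -> bool:
    # Exit code 0 from the test runner means this attempt passes.
    return subprocess.run(["pytest", "tests/test_feature.py"], cwd=workdir).returncode == 0

def agent_step(workdir: str) -> None:
    # One iteration of the (hypothetical) coding agent against this copy of the repo.
    subprocess.run(["my-agent", "--prompt-file", "prompt.md"], cwd=workdir)

def run_agent(workdir: str, stop: threading.Event) -> str | None:
    for _ in range(MAX_ITERATIONS):
        if stop.is_set():       # another agent already won; bail out
            return None
        agent_step(workdir)
        if tests_pass(workdir):
            stop.set()          # tell the other agents to stop after their current step
            return workdir
    return None

if __name__ == "__main__":
    stop = threading.Event()
    workdirs = [f"worktrees/agent-{i}" for i in range(4)]
    with cf.ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(run_agent, workdirs, [stop] * 4))
    print("winner:", next((r for r in results if r), "none"))
```

The losers only stop after finishing their current step, which is usually fine since you throw their worktrees away anyway.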

> but I require a ping pong with the LLM to iterate on the code/approach

I've never gotten good results from any LLM when doing more than one-shots. I basically have a copy-pastable prompt, and if the first answer is wrong, I update the prompt and begin from scratch. Usually I add in some "macro" magic too, to automatically run shell commands and whatnot.
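For the curious, the "macro" bit is roughly this kind of thing. A sketch only: the `{{sh: ...}}` syntax and the `call_model` stub are made up, not any particular tool.

```python
# Expand {{sh: <command>}} placeholders in a prompt template with the command's
# output, then send the whole thing as a single one-shot prompt.
import re
import subprocess

TEMPLATE = """\
Fix the failing test below. Do not touch unrelated files.

Current test output:
{{sh: pytest tests/test_feature.py -x --tb=short || true}}

Relevant file:
{{sh: cat src/feature.py}}
"""

def expand_macros(template: str) -> str:
    def run(match: re.Match) -> str:
        cmd = match.group(1)
        return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout
    return re.sub(r"\{\{sh:\s*(.+?)\}\}", run, template)

def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in whatever LLM API or CLI you use")

if __name__ == "__main__":
    prompt = expand_macros(TEMPLATE)
    print(call_model(prompt))  # wrong answer? edit TEMPLATE and re-run from scratch
```

The point is that a re-roll is cheap: the prompt is regenerated with fresh command output every time, so there's no stale conversation to drag along.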

It seems like they lose "touch" with what's important so quickly, and they manage to steer themselves further away if anything incorrect ends up anywhere in the context. Which, thinking about how they work, sort of makes sense.

That doesn't take away from the OP's point (and OP didn't specify what ping-ponging looks like; it could be the same as what you're describing): you're still iterating based on the results and updating the prompt based on issues you see in them. It grates on a human to switch back and forth between those attempts.

But if you're "starting from scratch", then what would be the problem? If none of the results match what you want, you iterate on your prompt and start from scratch. If one of them is suitable, you take it. If there's no iterating on the code with the agents, then this really wouldn't add much mental overhead? You just have to glance over more results.

I usually see that results are worse after ping-pong. If the one-shot doesn't do it, it's better to "re-roll". A context window full of crap poisons the model's ability to do better and stay on target.

I guess one way it might work is with a manager agent who delegates to IC agents to try different attempts. The manager reviews their work, understands the differences in what they're doing, and can communicate with you about it and then relay your feedback to the ICs doing the work. So you're like a client with a point of contact at an engineering org that internally manages how the project gets completed.

> From the post: There is no easy way to send the same prompt to multiple agents at once. For instance, if all agents are stuck on the same misunderstanding of the requirements, I have to copy-paste the clarification into each session.

It's not just about zero-shotting. You should be able to ping-pong back and forth with all of the parallel agents at the same time. Every prompt is a dice roll, so you may as well roll as many as possible.
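If your agent sessions happen to live in tmux panes, broadcasting the same clarification is a small script. A sketch, with made-up pane targets:

```python
# Send the same clarification to every agent pane instead of copy-pasting it N times.
import subprocess
import sys

# session:window.pane targets; adjust to however your agent panes are laid out
PANES = ["agents:0.0", "agents:0.1", "agents:0.2", "agents:0.3"]

def broadcast(prompt: str) -> None:
    for pane in PANES:
        # Type the text into the pane and press Enter.
        subprocess.run(["tmux", "send-keys", "-t", pane, prompt, "Enter"], check=True)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: broadcast.py <clarification text>")
    broadcast(" ".join(sys.argv[1:]))
```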

> Every prompt is a dice roll, so you may as well roll as many as possible.

Same vibe as the Datacenter Scale xkcd -> https://xkcd.com/1737/

I write docs often, and what works wonders with LLMs is good docs: a README, an architectural doc, etc.

It helps me plan the work well and helps the LLM work a lot better.

Bonus! Future you and other devs working on the system will benefit from the docs as well.

Yeah, it's not really usable for iteration. I don't parallelize this way; I parallelize by function, with different agents for different functions.

Meanwhile, a huge problem in parallelization is maintaining memory-banks, like https://docs.cline.bot/prompting/cline-memory-bank