> I've hit this point with AI where it's not a simple process, but a long drawn out back and forth.
In my experience, even on a relatively trivial task, you can ask an LLM at least 20 times:
Is this actually done, or only partially implemented? Did you finish x, y, z?
And the LLM will say no, it's not done, and keep working.
After that, I'll feed the branch to a different LLM and ask whether the implementation matches the design, and where it's weak or needs improvement.
Same thing - the fixes for that feedback will usually only be partially finished for several rounds.
When they all agree it's done, I'll finally look at the code, and there are still typically glaringly obvious problems - duplicate systems that reinvent the wheel, etc - that usually take more than one prompt to get right...
Getting things right takes ~100x as long as getting things almost right with LLMs.
You can tell an LLM to "make me Rust, but easier. Make no mistakes," and it'll plan out a 100-commit process and get something that - somehow - sort of works... but isn't even close to complete.
Still, on a cost basis, you can get features that would take you several times longer and cost orders of magnitude more money to build yourself, and - if you're doing it right - the LLMs will probably do a better job than you would've done (at least for me).
This is where the human element is critical, because it'll loop on review feedback forever if you let it, and the code will easily go off the rails into an over-engineered mess. That's why I review the code before and after, as well as the feedback itself - and often give the feedback to a different AI to get its opinion, as the other AI doesn't have a vested interest in it and can be more critical. At some point, though, you do have to cut them off and ship.
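
A rough sketch of that loop in Python, in case it helps make the shape concrete. Everything here is hypothetical: `review_loop`, the `Reviewer` and `Fixer` types, and the stub reviewers are illustrative names, and the stubs stand in for real LLM calls (no actual API is used). The point is only the structure: several independent reviewers, partial fixes per round, and a hard cap on rounds so it can't loop forever.

```python
from typing import Callable, List, Tuple

# Hypothetical types: a reviewer returns a list of open issues for the code,
# a fixer returns revised code given the current issues.
Reviewer = Callable[[str], List[str]]
Fixer = Callable[[str, List[str]], str]


def review_loop(
    code: str,
    reviewers: List[Reviewer],
    fix: Fixer,
    max_rounds: int = 5,
) -> Tuple[str, List[List[str]], bool]:
    """Run review/fix rounds until every reviewer reports no issues,
    or until max_rounds is used up (the "cut them off and ship" point)."""
    history: List[List[str]] = []
    for _ in range(max_rounds):
        # Collect feedback from every reviewer independently.
        issues = [issue for review in reviewers for issue in review(code)]
        history.append(issues)
        if not issues:
            return code, history, True  # all reviewers agree it's done
        code = fix(code, issues)
    # Budget exhausted: the last fix was never re-reviewed.
    return code, history, False


# Stub reviewers and fixer, standing in for separate LLMs (assumptions).
def todo_reviewer(code: str) -> List[str]:
    return ["unfinished TODO"] if "TODO" in code else []


def duplicate_reviewer(code: str) -> List[str]:
    return ["duplicate helper"] if code.count("def helper") > 1 else []


def partial_fixer(code: str, issues: List[str]) -> str:
    # Only addresses the first issue per round, mimicking partial fixes.
    if issues[0] == "unfinished TODO":
        return code.replace("TODO", "done", 1)
    return code.replace("def helper\n", "", 1)


if __name__ == "__main__":
    start = "def helper\ndef helper\nTODO\n"
    final, history, agreed = review_loop(
        start, [todo_reviewer, duplicate_reviewer], partial_fixer
    )
    print(agreed, len(history))
```

Even when `agreed` comes back true, that only means the reviewers stopped complaining; the human read of the code before and after still has to happen outside the loop.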