How do you prevent it from increasing scope?
That's the main issue I've found from running loops like this. Each loop has ~7 agents, say, looking through different lenses (security, UX, performance, etc.). Each one notes a few issues, each issue gets fixed, you do 5 to 8 loops, as you say. Each individual item that gets fixed looks minor but when you add it all up at the end you've increased PR size and scope significantly.
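A back-of-the-envelope sketch of how that adds up. The ~7 agents and 5 to 8 loops are the figures above; the issues per agent and lines per fix are made-up illustrative values, not measurements, and the function name is hypothetical:

```python
def estimate_added_lines(agents: int, issues_per_agent: int,
                         lines_per_fix: int, loops: int) -> int:
    """Total lines added if every noted issue gets fixed in every loop."""
    return agents * issues_per_agent * lines_per_fix * loops


# Assumed values for illustration only: 3 issues per agent, 10 lines per fix.
low = estimate_added_lines(agents=7, issues_per_agent=3, lines_per_fix=10, loops=5)
high = estimate_added_lines(agents=7, issues_per_agent=3, lines_per_fix=10, loops=8)
print(low, high)  # 1050 1680
```

Even with small per-fix numbers, the product across agents and loops ends up in the range of a large PR on its own.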
That is such a good point.
I recently opened a PR against this AI personal finance tool Ray https://github.com/cdinnison/ray-finance/pull/8 to add an Apple Card import feature, since Apple Card is not supported by Plaid.
I built the manual import feature, opened the PR, and then ran a code review.
What I hadn't thought about when I built the feature was how many implications importing data from Apple would have for the rest of the app, and that all of them had to be considered and integrated for the manual import to be a first-class feature, not "just a manual import" of data.
I ended up running adamsreview against it like 5-10 times before considering it complete, as I learned that there was much more to the integration than I had realized.
Now is that necessarily a problem? Maybe not. I should have realized from the start that the import feature was going to be much more than just a small feature. But at least, thanks to the review loop, I got it completely right before the PR was merged.
Yep, a few views here:
- one wave is code reduction via DRY removals and architectural fixes, and another is adversarial to get rid of false additions, so this helps with AI bloat either way
- as the other comment says, underspecification is a problem, so this ends up finding when the implementation, tests, docs, quality guide, and spec are out of sync, whichever one is to blame.
- Usable, well-designed, secure, and well-typed code ends up being bigger, so this helps cut to the chase. Ultimately, either you get there or you don't, and this helps cut review burden so you can do your part of it faster and at a higher level.
Funnily enough, I'm now playing with gardening agents whose job it is to reduce code. But I wouldn't want to slow PRs down on that, so I view those as separate PRs.