Hacker News

Sincere question: Has anyone figured out how we're going to code review the output of an agent fleet?

Insincere answer that will probably be attempted sincerely nonetheless: throw even more agents at the problem by having them do code review as well. The solution to problems caused by AI is always more AI.

regularfry 4 days ago

Technically that's known as "LLM-as-judge" and it's all over the literature. The intuition is that the capability to choose between two candidates doesn't exactly overlap with the ability to generate either one of them from scratch. It's a bit like the discriminator half of a generative adversarial network, which only has to judge samples, not produce them.
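
To make the "choose between two candidates" idea concrete, here is a rough sketch of a pairwise judge. Everything in it is an assumption for illustration: the `Judge` callable, `pairwise_pick`, and `toy_judge` are hypothetical names, and the toy judge stands in for a real model call. The one design choice worth noting is asking twice with the order swapped and accepting only a consistent verdict, since a judge can favor whichever candidate it sees first.

```python
from typing import Callable, Optional

# Hypothetical judge interface: given (task, first, second), return "first" or "second".
# In practice this would wrap a model call; here it is any callable.
Judge = Callable[[str, str, str], str]


def pairwise_pick(judge: Judge, task: str, a: str, b: str) -> Optional[str]:
    """Ask the judge twice with the candidates swapped; keep only a consistent verdict."""
    first_pass = judge(task, a, b)   # a is shown first
    second_pass = judge(task, b, a)  # b is shown first
    winner_1 = a if first_pass == "first" else b
    winner_2 = b if second_pass == "first" else a
    # None means the verdict flipped with the ordering, so escalate to a human.
    return winner_1 if winner_1 == winner_2 else None


def toy_judge(task: str, first: str, second: str) -> str:
    # Stand-in for a model (assumption): prefers the shorter patch.
    return "first" if len(first) <= len(second) else "second"


def position_biased_judge(task: str, first: str, second: str) -> str:
    # Stand-in for a judge that always prefers whatever it sees first.
    return "first"
```

With the toy judge the shorter candidate wins regardless of ordering, while the position-biased judge yields `None`, which is the case you would hand to a person.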

brookst 4 days ago

s/AI/tech

sensanaty 4 days ago

Most of the people pushing this want to just sell an MVP and get a big exit before everything collapses, so code review is irrelevant.

lsllc 4 days ago

Simple, just ask an(other) AI! But seriously, different models are better/worse at different tasks, so if you can figure out which model is best at evaluating changes, use that for the review.
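
One way to "figure out which model is best at evaluating changes" is to score each candidate reviewer against changes whose outcome you already know. A minimal sketch, assuming a hypothetical `Reviewer` callable that returns True when it flags a diff as buggy and a small hand-labeled set of diffs; the names and the toy reviewers are made up for illustration and are not any real model API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical reviewer interface: returns True if it flags the diff as buggy.
Reviewer = Callable[[str], bool]
# Each labeled example is (diff_text, is_actually_buggy).
Labeled = List[Tuple[str, bool]]


def reviewer_accuracy(reviewer: Reviewer, labeled: Labeled) -> float:
    """Fraction of labeled diffs where the reviewer's verdict matches the label."""
    hits = sum(1 for diff, is_buggy in labeled if reviewer(diff) == is_buggy)
    return hits / len(labeled)


def best_reviewer(reviewers: Dict[str, Reviewer], labeled: Labeled) -> str:
    """Name of the reviewer with the highest accuracy on the labeled set."""
    return max(reviewers, key=lambda name: reviewer_accuracy(reviewers[name], labeled))


# Toy stand-ins for models (assumptions, not real APIs).
def flags_everything(diff: str) -> bool:
    return True


def flags_todo(diff: str) -> bool:
    return "TODO" in diff


labeled_diffs: Labeled = [
    ("+ return x  # TODO handle None", True),
    ("+ return x or 0", False),
    ("+ log.info('done')", False),
]
```

Accuracy is the crudest possible metric here; if missed bugs cost more than false alarms, you would weight the two kinds of error differently.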

phamilton 4 days ago

I suspect this will indeed be part of it, but it won't work with today's AIs on today's codebases.

Models will improve, but also I predict code style and architecture will evolve towards something easier for machine review.

nchmy 4 days ago

Sincere question: why would you not be able to code review it in the same way you would for humans?

phamilton 4 days ago

Agents could generate more PRs in a weekend than my team could code review in a month.

Initially we can absolutely just review them like any other PR, but at some point code review will be the bottleneck.

fxtentacle 4 days ago

You just don't. Choose randomly and then try to quickly sell the company. /s