Agents can already do the review by themselves. I'd be surprised if they review all of the code by hand. They probably can't mention it because of the regulatory constraints of the field itself. But from what I have seen, agentic review tools are already between the 80th and 90th percentile: out of 10 randomly picked engineers, such a tool will provide more useful comments than most of them.

The problem with LLM code review is that it's good at checking local consistency and catching minor bugs, but it generally can't tell you if you are solving the wrong problem or if your approach is a bad one for non-technical reasons.

This is an enormous drawback and makes LLM code review more akin to a linter at the moment.

I mean, if the model can reason about making changes in a large-scale repository, then this implies it can also reason about a change somebody else made, no? I kinda agree and disagree with you at the same time, which is why I said "most engineers", but I believe we are heading towards models being able to write and review their own changes completely autonomously.

There's a good chance that in the long run LLMs can become good at this, but this would require them to be plugged into, e.g., the meetings and other discussions that led to a particular feature request. To be a good software engineer, you need all the inputs that software engineers get.

If you read through the Stripe blog thoroughly, you will see that they already feed their model this or a similar type of information. Being plugged into the meetings might just mean feeding the model the meeting minutes, or letting the model listen to the meeting and transcribe it. It seems to me that both are possible even today.