I built a more naive version for our team using Copilot and GitHub Actions, and it works quite well (wish I had metrics too). The team loves it.
The ROI here is so high that I don't mind using the strongest model available for the actual code review. I don't trust Sonnet and such. Just let Opus or GPT 5.5 do the whole thing and pay a bit more for less complexity.
I did something similar with Copilot.
I have about 15 subagents doing reviews from different perspectives, plus some that provide additional value: finding agents.md files, doing confidence ranking, and describing images attached to the PR (which get validated later on against the Jira issue description).
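A minimal sketch of that kind of fan-out, assuming each subagent can be modeled as a function that takes the diff and returns findings with a self-reported confidence. The reviewer functions, the `Finding` type, and the 0.6 threshold are all hypothetical, and the string checks stand in for real model calls; this is not the actual setup described above.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    perspective: str   # which reviewer produced the finding
    message: str
    confidence: float  # 0.0 to 1.0, self-reported by the reviewer


# Hypothetical stand-ins for subagents; a real one would call a model.
def security_review(diff: str) -> list[Finding]:
    if "eval(" in diff:
        return [Finding("security", "eval() on untrusted input", 0.9)]
    return []


def architecture_review(diff: str) -> list[Finding]:
    if "import *" in diff:
        return [Finding("architecture", "wildcard import hides dependencies", 0.5)]
    return []


Reviewer = Callable[[str], list[Finding]]


def run_review(diff: str, reviewers: list[Reviewer],
               min_confidence: float = 0.6) -> list[Finding]:
    """Run every reviewer on the diff in parallel, then keep and rank confident findings."""
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        batches = list(pool.map(lambda review: review(diff), reviewers))
    # Flatten, drop low-confidence findings, and put the most confident first.
    kept = [f for batch in batches for f in batch if f.confidence >= min_confidence]
    return sorted(kept, key=lambda f: f.confidence, reverse=True)


if __name__ == "__main__":
    sample = "from utils import *\nresult = eval(user_input)\n"
    for f in run_review(sample, [security_review, architecture_review]):
        print(f"[{f.perspective}] {f.confidence:.1f} {f.message}")
```

Keeping the ranking step separate from the reviewers means a new perspective can be added without touching the merge logic.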
I've used it since about November, and it became widely popular in my company in April. All of that ran on 300 premium requests (because they allowed starting subagents, and there was no limit on how long a single request could last), so it would have cost something like $5000 and $8000 for April and May at API pricing. I had a similar cost per review (about $0.90) with Opus 4.6, with help from Sonnet and Haiku for simpler tasks. It did about 4000 reviews during the last 2 months.
And starting in June, it will be dead, because it will move to API pricing, and for $30 (or $19 since September) it will do just a few reviews.
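For a rough sense of scale, here is the back-of-the-envelope arithmetic, assuming the whole monthly amount converts one-to-one into API spend and that the roughly $0.90 per review figure still holds. Both are assumptions, and `reviews_per_budget` is just an illustrative helper.

```python
import math


def reviews_per_budget(budget_usd: float, cost_per_review_usd: float) -> int:
    """Whole number of reviews a monthly budget covers at a fixed cost per review."""
    return math.floor(budget_usd / cost_per_review_usd)


print(reviews_per_budget(30, 0.90))  # 33
print(reviews_per_budget(19, 0.90))  # 21
```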
A fun project.
Do you also have separate prompts for each domain (security, architecture, etc.)?
Would love to look into it if any part of it is open source.