> You are likely to get better results if you do not use the same model for review that wrote the code
There’s no evidence of this. I guess you are anthropomorphising models (i.e., it’s good when a different human reviews your code).
There is some evidence.[1] The best reviewer is a different model with fresh context; the worst is the same model with the same context.
1. https://arxiv.org/pdf/2603.04582
Yeah, which model you pick seems to matter less. They respond differently to the same prompts, so if anything, I'd use multiple prompts rather than pick one model over another.
However, using two models to generate two reviews easily beats using one model for one review, as some models seem to "care" more about certain things. If you swap the model rather than add another, you'll just miss different things.
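To make the "add a reviewer rather than swap one" point concrete, here is a rough sketch of merging findings from several reviewers. Everything in it is hypothetical: `combined_review`, `reviewer_a`, and `reviewer_b` are made-up names, and the two reviewers are hard-coded stand-ins for whatever model calls you would actually make.

```python
from typing import Callable, Iterable

# Assumption: a "reviewer" is any callable that takes a diff and returns findings.
# In practice each one would wrap a call to a different model or prompt.
Reviewer = Callable[[str], Iterable[str]]


def combined_review(diff: str, reviewers: list[Reviewer]) -> list[str]:
    """Union the findings of all reviewers, dropping case-insensitive duplicates."""
    seen: set[str] = set()
    merged: list[str] = []
    for review in reviewers:
        for finding in review(diff):
            text = finding.strip()
            key = text.lower()
            if key and key not in seen:
                seen.add(key)
                merged.append(text)
    return merged


# Hard-coded stand-ins (hypothetical): each "model" notices different things.
def reviewer_a(diff: str) -> list[str]:
    return ["Missing null check", "No tests for error path"]


def reviewer_b(diff: str) -> list[str]:
    return ["missing null check", "Unbounded retry loop"]


findings = combined_review("<diff goes here>", [reviewer_a, reviewer_b])
for item in findings:
    print("-", item)
```

Swapping `reviewer_a` for `reviewer_b` would trade one blind spot for another; passing both keeps the union of what each one catches.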
Well, they are different, human or not. So it makes sense to have the code reviewed by "something" different from the one that wrote it.