Nice in theory, hard in practice.

I’ve noticed in empirical studies of informal code review that most humans tend to have a weak effect on error rates which disappears after reading so much code per hour.

Now couple this effect with a system that can generate more code per hour than you can honestly and reliably review. It’s not a good combination.