I think they split the codebase into smaller files or modules, then tell the AI there's a bug in this particular file and to go find it.

Then they loop over the whole codebase like this. That way you always point the model at a 'known' bug. And I assume keeping the context small helps with quality.

Not entirely sure, though; it's obviously proprietary.