The same way they already do with human coders whose unit tests were developed by exactly the same flawed processes:

Mediocrely.

Sometimes the current process works; other times planes fall out of the sky, or an update causes millions of computers to blue-screen on startup at the same time.

LLMs in particular, and AI in general, don't need to beat humans at the same tasks.