A TDD approach could play the RL role.

But what makes you think the AI-generated tests will correctly represent the problem at hand?