> If you don't have an essentially infinite set to draw your validation data from then a large enough model will memorize it as part of its developer teams KPIs.

Sounds like a use-case for property testing: https://en.wikipedia.org/wiki/Software_testing#Property_test...