Isn't that the purpose of red/green refactoring though? To establish working software that guards against regressions and builds trust (in the software)?
If your premise is that people would shift to using AI to write tests they don't understand, then that's not necessarily a failing of the contributor.
The contributor might not understand the output, but the maintainer would be able to critique a spec file and determine pretty quickly if implementation would be worthwhile.
This would necessitate small tickets, thereby creating small spec files that are easier for maintainers to review.
Also, any PR that included a non-spec file could be rejected out of hand.
It is possible for users of AI to learn from reading specs.
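To make that concrete: a spec file in the red/green sense is just a test written before the implementation, so a maintainer can judge the intended behavior at a glance. A minimal sketch in Python, where the function name `slugify` and its behavior are hypothetical, chosen purely for illustration:

```python
def slugify(title: str) -> str:
    # Implementation added only after the test below existed and failed ("red"),
    # then written to make it pass ("green").
    return "-".join(title.lower().split())


def test_slugify_joins_words_with_hyphens():
    # The spec: readable on its own, small enough to critique quickly.
    assert slugify("Red Green Refactor") == "red-green-refactor"


test_slugify_joins_words_with_hyphens()
```

Even a contributor who didn't write the implementation can read a spec like this and see exactly what behavior is being promised, which is the point the parent is making.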
But if agents are doing the entire thing (reading the ticket, generating the PR, submitting the PR)... then the point about people not understanding is moot.
From my experience, you can't trust the agent to do the entire thing unless you set up very heavy linters, quality-control systems (e.g. SonarQube), and a long list of other safeguards, because AI tends to produce pretty bad code: repetition, unused code, lack of structure... basically all the things that we've spent decades learning not to do. And then there's the point where you hit a fairly obscure bug that you can only solve with a deep understanding of the code, which you won't have, because you delegated that to an agent.
I like agentic programming, I use it, but I review everything that the agent does and frequently spend a few cycles simply telling the agent to refactor the code because it constantly produces technical debt.