"They could even use an LLM to detect if the data has been poisoned."

And for extra safety, you can add another LLM agent that checks on the first… and so on. Infinite safety! /s

People already do this with multi-agent workflows. I kind of do this with local models: I have a smaller model do the bulk of the work for speed, then a bigger model check its output and improve it.
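
The loop is roughly draft-then-review. Here's a minimal sketch, assuming an Ollama-style local endpoint; the model names, prompts, and `draft_and_review` helper are just placeholders for whatever you actually run locally:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def generate(model: str, prompt: str) -> str:
    """Send one non-streaming prompt to a local model and return its reply."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

def draft_and_review(task: str) -> str:
    # Smaller model produces the first draft quickly.
    draft = generate("llama3.2:3b", task)
    # Bigger model checks the draft and returns an improved version.
    review_prompt = (
        f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
        "Check the draft for mistakes and rewrite it with any fixes applied."
    )
    return generate("llama3.1:70b", review_prompt)

print(draft_and_review("Explain what data poisoning is in two sentences."))
```

The catch, of course, is that the reviewer is still an LLM, so you're back to the joke above: it supervises the small model, but nothing supervises it.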

The tech surely has lots of potential, but my point was just that self-improvement doesn't really work unsupervised yet.