That is not how training works…

That's how model distillation works.

DeepSeek is the most notable case, but it's been used a lot.

And the foundation model companies are scraping and exfiltrating each other's data.