That's interesting, though you have to imagine the data set is very low quality on average and distilling high quality training pairs out of it is very costly.

Hence exponential increase in model training costs. Also hallucinations in the long tail of knowledge.