My guess is they get high-quality training data.

This is correct. The most valuable form of data for any AI company is corrective feedback from real use cases.