Uncharitably, I think this is a strategy to gorge further on training data, especially if they select for higher-quality open source. They are embracing the best projects so they can train on the iteration patterns of the best, and they get a semi-self-correcting mechanism for slop along the way.
Charitably, this will be great for open source software... so long as they never moat up and lock down.
Can't they just keep scraping these repositories for new data anyway? Or has that changed?