How much of the training set comes from websites whose terms of service say "no automated scraping"?