I don’t know if you’ve tried to scrape or programmatically download a lot of websites recently! It’s not possible to repeat their data collection process anymore.
maybe i'm just being pedantic. it's possible that, for that reason, you could only build models like these from scratch up until a few years ago, but isn't that an (illegal, unethical) early-mover advantage rather than ladder pulling?
to me ladder pulling would be:
- web scraping for model training becomes illegal, with heavy punitive penalties
- training models above a certain compute threshold requires government licensing
- expensive third-party audits are required before deploying models above a capability threshold