As Musk and Dorsey have said, IP law is highly incompatible with AI.
Wasn't there some news a while ago that Anthropic and other frontier model companies used a bunch of pirated books to train their models? Are we not all benefiting from the fact that they also crawled a bunch of open code repos?
If something is open source, it's pretty easy to tell when code was pulled directly from another repo and included in a project. It's much harder to know whether the model that built something drew on that repo (through training or simply by searching online).
> Wasn't there some news a while ago that Anthropic and other frontier model companies used a bunch of pirated books to train their models? Are we not all benefiting from the fact that they also crawled a bunch of open code repos?
It was Meta. With Zuck's explicit permission.
And Anthropic owes $1.5 billion:
https://apnews.com/article/anthropic-authors-copyright-judge...