I appreciate your comment, but you answered as if this question had been answered legally. It has not.
The New York Times is suing both OpenAI and Microsoft for copyright infringement. The Authors Guild is suing OpenAI. Getty Images is suing Stability AI. Disney is suing Midjourney. Universal Music Group and Sony have filed suits against multiple AI companies.
> so copyright doesn’t get involved at all.
The dozens of ongoing cases that discredit that statement.
Which statement of mine do you think is not settled law? Which law do you think is being broken and how?
Your objection doesn’t make sense. In the event that an AI company loses a lawsuit for copyright infringement based on simply training on copyrighted works, the answer to you saying you’d like to understand why they can do it and you can’t is simply “your premise is wrong; neither of you can”.
> Which statement of mine do you think is not settled law?
I object to your statement that "copyright doesn’t get involved at all" when that is objectively untrue. If that was true, many of the world's largest companies wouldn't be spending tens of millions of dollars to have that question answered in court. Go to any law-focused forum, and you will find attorneys arguing over these questions.
To train a model using a book, you must first obtain a copy of that book. Did OpenAI purchase a copy of every book not already in the public domain used during training? They did not.
Some of the suits I mentioned claim that OpenAI literally stole copies of books to train its models.
My point is that the copyright question has not been answered. If the NYT, et. al. win, it will be a watershed moment for how AI companies pay for training data moving forward.