Hacker News

JimDabell 3 hours ago

> I'd like to understand why I can't use a song in one of my videos without permission/payment, but an AI company can train models using that song without having either.

Because when you say you are “using” the song, what you mean is that you are distributing copies of the song, which is protected by copyright.

When AI companies train on the song, the model is learning from it. Outside of the rare cases of memorisation, this is not distributing copies and so copyright doesn’t have any say in the matter.

Learning isn’t copying, so copyright doesn’t get involved at all.

LatencyKills 3 hours ago

I appreciate your comment, but you answered as if this question had been answered legally. It has not.

The New York Times is suing both OpenAI and Microsoft for copyright infringement. The Authors Guild is suing OpenAI. Getty Images is suing Stability AI. Disney is suing Midjourney. Universal Music Group and Sony have filed suits against multiple AI companies.

> so copyright doesn’t get involved at all.

The dozens of ongoing cases discredit that statement.

JimDabell 3 hours ago

Which statement of mine do you think is not settled law? Which law do you think is being broken and how?

Your objection doesn’t make sense. In the event that an AI company loses a lawsuit for copyright infringement based on simply training on copyrighted works, the answer to you saying you’d like to understand why they can do it and you can’t is simply “your premise is wrong; neither of you can”.

LatencyKills 3 hours ago

> Which statement of mine do you think is not settled law?

I object to your statement that "copyright doesn’t get involved at all", which is objectively untrue. If it were true, many of the world's largest companies wouldn't be spending tens of millions of dollars to have that question answered in court. Go to any law-focused forum, and you will find attorneys arguing over these questions.

To train a model using a book, you must first obtain a copy of that book. Did OpenAI purchase a copy of every book not already in the public domain used during training? They did not.

Some of the suits I mentioned claim that OpenAI literally stole copies of books to train its models.

My point is that the copyright question has not been answered. If the NYT et al. win, it will be a watershed moment for how AI companies pay for training data moving forward.