I'm not whining in this case, just pointing out "they gave it out for free" is completely false, at the very least for the GNU types. It was always meant to come with plenty of strings attached, and when those strings were dodged new strings were added (GPL3, AGPL).

If I had a photographic memory and I used it to replicate parts of GPLed software verbatim while erasing the license, I could not excuse it in court that I simply "learned from" the examples.

Some companies outright bar their employees from reading GPLed code because they see it as too high of a liability. But if a computer does it, then suddenly it is a-ok. Apparently according to the courts too.

If you're going to allow copyright laundering, at least allow it for both humans and computers. It's only fair.

> If I had a photographic memory and I used it to replicate parts of GPLed software verbatim while erasing the license, I could not excuse it in court that I simply "learned from" the examples.

Right, because you would have done more than learn: you would have gone past learning and used what you learned to reproduce the work.

It works exactly the same for an LLM. Training the model on content you have legal access to is fine. Afterwards, someone using that model to produce a replica of that content is engaged in copyright infringement.

You seem set on conflating the act of learning with the act of reproduction. You are allowed to learn from copyrighted works you have legal access to, you just aren't allowed to duplicate those works.

The problem is that it's not the user of the LLM doing the reproduction, the LLM provider is. The tokens the LLM is spitting out are coming from the LLM provider. It is the provider that is reproducing the code.

If someone hires me to write some code, and I give them GPLed code (without telling them it is GPLed), I'm the one who broke the license, not them.

> The problem is that it's not the user of the LLM doing the reproduction, the LLM provider is.

I don't think this is legally true. The law isn't fully settled here, but things seem to be moving towards the LLM user being the holder of the copyright of any work produced by that user prompting the LLM. It seems like this would also place the infringement onus on the user, not the provider.

> If someone hires me to write some code, and I give them GPLed code (without telling them it is GPLed), I'm the one who broke the license, not them.

If you produce code using an LLM, you (probably) own the copyright. If that code is already GPL'd, you would be the one engaged in infringement.

[flagged]

[flagged]

> You seem set on conflating "training" an LLM with "learning" by a human.

"Learning" is an established word for this, happy to stick with "training" if that helps your comprehension.

> LLMs don't "learn" but they _do_ in some cases, faithfully regurgitate what they have been trained on.

> Legally, we call that "making a copy."

Yes, when you use an LLM to make a copy... that is making a copy.

When you train an LLM... that isn't making a copy, that is training. No copy is created until output is generated that contains a copy.

Everything which is able to learn is also alive, and we don't want to start to treat digital devices and software as living beings.

If we are saying that the LLM learns things and then made the copy, then the LLM committed the crime and should receive the legal punishment and be sent to jail, banned from society until it is deemed safe to return. It is not as if the installed copy is some child spawned from digital DNA, so that the parent continues to roam while the child gets sent to jail. If we are to treat it like a living being that learns things, then every copy and every version is part of the same individual, and thus the whole individual gets sent to jail. No copy is created when it is installed on a new device.

> we don't want to start to treat digital devices and software as living beings.

Right, because then we have to decide at what point our use of AI becomes slavery.

[flagged]

[flagged]

[flagged]

You both broke the site guidelines badly in this thread. Could you please review https://news.ycombinator.com/newsguidelines.html and stick to the rules? We ban accounts that won't, and I don't want to ban either of you.

[flagged]

You both broke the site guidelines badly in this thread. Could you please review https://news.ycombinator.com/newsguidelines.html and stick to the rules? We ban accounts that won't, and I don't want to ban either of you.

I'm polite in response to being repeatedly called names and this is your response?

If you think my behavior here was truly ban-worthy, then do it, because I don't see anything in the thread I would change except for engaging at all.

This is the sort of thing I was referring to:

> Instead of bothering to read and understand you have continued to call names.

> You seemed confused, you still seem confused

> your pointless semantic nitpick

> you need to get some more real world experience

I wouldn't personally call that being polite, but whatever we call it, it's certainly against HN's rules, and that's what matters.

Edit: This may or may not be helpful (probably not!) but I wonder if you might be experiencing the "objects in the mirror are closer than they appear" phenomenon that shows up pretty often on the internet - that is, we tend to underestimate the provocation in our own comments, and overestimate the provocation in others' comments, which in the end produces quite a skew (https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...).

Sorry, and thanks.

I know moderation is a tough gig.