Generative AI changed the equation so much that our existing copyright laws are simply out of date.

Even copyright laws with provisions for machine learning were written when that meant tangential uses like ranking algorithms, or training task-specific models that couldn't directly compete with their source material.

For code it also completely changes where the human-provided value is. Copyright protects specific expressions of an idea, but we can auto-generate the expressions now (and the LLM indirection messes up what "derived work" means). Protecting the ideas that guided the generation process is a much harder problem (we have patents for that and it's a mess).

It's also a strategic problem for GNU. GNU's goal isn't licensing per se, but giving users the freedom to control their software. Licensing was just a clever tool that repurposed copyright law to make the freedoms GNU wanted somewhat legally enforceable. Now that it's so easy to launder a codebase's license, it stops being an effective tool.

GNU's licensing strategy also depended on a scarcity of code (contribute to GCC, because writing a whole compiler from scratch is too hard). That hasn't worked well for a while due to permissive OSS already reducing scarcity, but gen AI is the final nail in the coffin.

It's not a problem. If you give a work to an AI and say "rewrite this", you created a derivative work. If you don't give it the work and instead say "write a program that does (whatever the original code does)", then you didn't. During discovery, the original author will get to see the rewriter's Claude logs and see which one it is. If the rewriter deleted their Claude logs during the lawsuit, they go to jail. If they deleted their Claude logs before the lawsuit, the court decides which scenario is more likely based on the remaining evidence.

But the AI already has the work to derive from. I just went to Gemini and said "make me a picture of a cartoon plumber for a game design". By your logic, the image it made me (a tubby character with a red cap, blue dungarees, a red top, and a big bushy mustache) is not a derivative work...

(Interestingly, asking it to make him some friends produced more 'original' ideas, but when I asked it to give him a brother... I can hear the big N's lawyers writing a letter already.)

Except Claude was surely trained on the original work, and when asked to produce a new product that does the same thing it will just spit out a (near) copy.

Ok, but what if, in the future, I could guarantee that my generative model was not trained on the work I want to replicate? Say library X is the only library in town for some task, but it has a restrictive license. Can I use a model that was guaranteed not to have been trained on X to generate a new library Z that competes with X under a more permissive license? And what if someone looks and finds a lot of similarities?

This is what Adobe is ostensibly trying to do with its GenAI image model, Firefly.

https://en.wikipedia.org/wiki/Adobe_Firefly

I wish you luck proving it wasn't trained on the original library or any work that infringed itself.

I think there could be a market for "permissive/open models" in the future, where a company specifically makes LLMs trained only on a large corpus of public-domain or permissively licensed text/code, and you can prove it by downloading the corpus yourself and reproducing the exact same model if desired. Proving that all MIT-licensed code is itself non-infringing is probably impossible, though; at that point, copyright law is meaningless, because everyone would be in violation if you dig deep enough.
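The "download the corpus yourself and reproduce the model" idea presupposes that everyone can agree on exactly what the training data was. A minimal sketch of that first step (entirely hypothetical, not any vendor's actual process) is a deterministic corpus fingerprint: hash every file in a fixed order, so independent parties holding the same files compute the same digest before attempting a bit-for-bit training reproduction.

```python
# Sketch: a reproducible "corpus fingerprint" for a training set.
# Anyone holding the same files can recompute the same digest, which is a
# hypothetical precondition for the verifiable training described above.
# Directory layout and function name are illustrative assumptions.
import hashlib
from pathlib import Path

def corpus_fingerprint(root: str) -> str:
    h = hashlib.sha256()
    # Sort paths so the digest is independent of filesystem iteration order.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Mix in the relative path, so renames change the fingerprint too.
            h.update(str(path.relative_to(root)).encode())
            h.update(path.read_bytes())
    return h.hexdigest()
```

Of course, a matching fingerprint only pins down the inputs; reproducing the exact weights additionally requires deterministic training (fixed seeds, fixed hardware behavior), which is a much harder engineering problem.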

“Changing the equation” by boldly breaking the law.

> “Changing the equation” by boldly breaking the law.

Is it? I think the law is truly undeveloped when it comes to language models and their output.

As a purely human example, suppose I once long ago read through the source code of GCC. Does this mean that every compiler I write henceforth must be GPL-licensed, even if the code looks nothing like GCC code?

There's obviously some sliding scale. If I happen to commit lines that exactly replicate GCC then the presumption will be that I copied the work, even if the copying was unconscious. On the other hand, if I've learned from GCC and code with that knowledge, then there's no copyright-attaching copy going on.

We could analogize this to LLMs: instructions to copy a work would certainly be a copy, but an ostensibly independent replication would be a copy only if the work product had significant similarities to the original beyond the minimum necessary for function.

However, this is intuitively uncomfortable. Mechanical translation of a training corpus to model weights doesn't really feel like "learning," and an LLM can't even pinky-promise to not copy. It might still be the most reasonable legal outcome nonetheless.

Non-sequitur. It can be both.

It's only breaking the law if you don't have enough money to pay the politicians.

> Generative AI changed the equation so much that our existing copyright laws are simply out of date.

Copyright laws are predicated on the idea that valuable content is expensive and time consuming to create.

Ideas are not protected by copyright, expression of ideas is.

You can't legally copy a creative work, but you can describe the idea of the work to an AI and get a new expression of it in a fraction of the time it took for the original creator to express their idea.

The whole premise of copyright is that ideas aren't the hard part, the work of bringing that idea to fruition is, but that may no longer be true!

Honestly, good. Copyright and IP law in general have been so twisted by corporations that only they benefit now: see Disney's Mickey Mouse copyright extensions, Nintendo's patents on obvious things, or patent trolling in general.

The biggest recording artist in the world right now had to re-record her early albums because she didn't own the copyright; imagine how many artists never get that big and never have that opportunity.

That individual artists are still defending this system is baffling to me.

[deleted]

> GNU's goal isn't licensing per se, but giving users freedom to control their software.

I think that's maybe a misunderstanding. GNU wants everyone to be able to use their computers for the purposes they want, and software is the focus because software was the bottleneck. A world where software is free for anyone to create is a GNU utopia, not a problem.

Obviously the bigger problem for GNU isn't software, which was pretty nicely commoditized already by the FOSS-ate-the-world era of two decades ago; it's restricted hardware, something that AI doesn't (yet?) speak to.