If that were true, a developer would own copyright over the source code, but nothing on the compiled binaries, and I could download practically all software available as compiled binaries and use it for free.
Indeed, a developer owns copyright over the source code and over the compiled binaries, because there is no expansion happening here, just a translation from one format into another, the kind of thing that has been ruled copyrightable for as long as copyright has existed. The same goes for translations from one human language into another, and anybody with knowledge of more than one language will be happy to acknowledge that translating is hard work. Even so, the translator does not hold copyright on the result; at best they can say they have created a derived work, and it is the original author who continues to hold copyright.
Compilation and translation happen in a generic manner and do not rely on a mountain of other IP; the compiler is really just a transformative tool that happens to do something useful. Someone constructed it to be a very precise translation, to the point that any mistakes in it are called bugs, and we fix them to ensure the process stays deterministic. Translators try hard to 'get it right' too: to affect the intentions of the original author as little as possible.
When you use a model loaded up with noise, or one that you have trained exclusively on code that you actually wrote, I think a strong case could be made that you own the copyright on that work product. But when you train that model on other people's work, especially without their consent, or use a model that has been trained in that way, you lose your right to call the output of that model yours.
You did not write it, and the transformative process requires terabytes of other people's IP and only a little bit from you.
As soon as you can prove that your contribution substantially outweighs the amount of IP contributed in total, you would have a much stronger case.
>> No, that human owns the copyright on the prompt, not on the work product.
I think I may have misunderstood your original comment above. It seems intended to say:
No, that human owns the copyright on the prompt, not necessarily on the work product. The human may partially hold copyright over the work product as well, "how much" depending on how much new creative expression from the human was involved vs. that from others.
That is in fact correct.
Both the compiler (absent inclusion of copyrighted libraries) and the LLM are considered not to add creative work, and thus do not change the copyright status of the works they transform.
You can consider the training set of the LLM or other AI model to be 3rd party libraries, and the level of copyright from them that applies to the final output to be how much can be directly considered derivative, just as reading copyrighted code and being inspired by it does not pass that copyright to your work unless it's obviously derivative.
>> You can consider the training set of the LLM or other AI model to be 3rd party libraries ...
I like this comparison -- training set as '3rd party libraries'. Except, of course, that the authors behind the training set may not have actually granted permission to use their work, whereas the 3rd party libraries usually grant some permission by way of a license.
+1
Adding two subtle points:
>> Indeed a developer owns copyright over the source code and on the compiled binaries, because there is no expansion happening here but just a translation from one format into another ... does not rely on a mountain of other IP
... and, the license agreements of the compiler and the libraries used / linked to practically always explicitly waive copyright claims over the said non-mountain of IP.
>> As soon as you can prove that your contribution substantially outweighs the amount of IP contributed in total you would have a much stronger case.
... a much stronger case that you have a partial copyright over the work, which is now likely a derivative work. You still may not have a case that you own the copyright exclusively (or as the original article says, that your employer does).
> If that were true, a developer may own copyright over the source code, but nothing on the compiled binaries, and I could download practically all software available as compiled binaries and use for free.
If the compiled binaries (output) were produced by running the input (source code) over every program written, then sure.
But that's not what's happening with compilers, is it? The output of a prompt is dependent on the copyrighted work of others every single time it is run.
The output of a compiler is not dependent on the copyrighted output of every other program.
I think your comments are originating in how I may have taken jacquesm's comment too literally, as I just wrote here https://news.ycombinator.com/item?id=47944938
However:
1. The "every"ies in your comment are not to be taken literally either. :-)
>> If the compiled binaries (output) were produced by running the input (source code) over every program written, then sure.
2. More importantly, the above seems cyclically dependent on whether output from generative AI is deemed to be in the public domain or not, which I consider an open question as of now. It is not so 'sure' as yet. :-)
That’s not how it works. The human using the tool (like Claude Code, etc.) owns the copyright of the code generated.
No, you are wrong about this.
See:
https://technophilosoph.com/en/2025/02/07/ai-prompts-and-out...
If you have a more recent citation referring to case law that states the opposite, that would be great, but afaik this article reflects the current state of affairs.
The human using the tool creates a prompt, and there is then an automatic transformation of the prompt into code. Such automatic transformation is generally accepted not to create a new work (after all, anybody else inputting the same prompt would have a reasonable expectation of generating the same output, modulo some noise due to versioning and possibly other local context).
Claude Code, and AI-generated code in general, does not at present create a new work. But the prompt, the part which you input, may be sufficiently creative to warrant copyright protection.
In the US, the copyright office (as the article you link to says) has declined to define “meaningful” contribution. If you want to argue that the user doesn’t own it for incredibly trivial prompts, I won’t argue (though I consider that to be non-useful code).
Every developer I’ve seen use these tools has engaged in meaningful contribution: specific directions across multiple prompts, often (though not always) editing the code afterwards, manually running the code and prompting for changes, etc.
Until the courts, legislators, or the copyright office define something otherwise, I’m highly confident of my assertion. (Mostly because of the insane number of hours I’ve spent with counsel on this. And, as a disclaimer, since I am biased: I worked on Copilot and Google’s various AI assisted coding products as an SVP and VP.)
If my business depended on a legal fiction being true and I had invested a whole pile of effort and money into it being so, then I too would argue at every opportunity that 'of course it is legal'. But that's just a version of fake-it-until-you-make-it, and in practice not all of those bets pay off.
The fact that meaningful contribution has not been defined is a strong signal that things are not nearly as clear-cut as you make them out to be. Until there is a ruling that clearly establishes that the person who generated the prompt owns the copyright on the code, I think it is misleading to suggest that this is already the case; your lawyers are not the lawyers of the parties that will end up hurt if it turns out not to be so.
For contrast: we have a very clear idea of which things are copyrighted, and in general these things do not rest on a foundation of IP appropriated from others outside of the license terms. The fact that the infringement is fine-grained and effectively harms the rights of thousands or more individuals doesn't change the heart of the matter: whoever wrote the code, it wasn't you.
Given your bias I'm not surprised that this would be your argument, though: effectively you have created a copyright laundromat using code that you were nominally the steward of, not the owner. But whether it stands long term is not up to your lawyers.
Prove I did not write my code if I do not tell you which tools I used. =}
That's not how that works.
You warrant that you wrote the code yourself; then it is found that your code infringes on code owned by other entities. Now you have a tough choice: admit you lied about writing your code yourself, tainting all of the code you claim you wrote since these tools became available, or stand firm and take the infringement penalty, which could be very substantial.
Judges and courts don't like playing silly games like this.
I've sued two parties for copyright infringement and won and a third settled out of court for a substantial sum. You don't tell a judge you don't need to prove you wrote the code, that's an automatic loss. Then there are such things as expert witnesses who will interview you and check how much you know about the code you claim you wrote.
>I've sued two parties for copyright infringement and won and a third settled out of court for a substantial sum. You don't tell a judge you don't need to prove you wrote the code, that's an automatic loss. Then there are such things as expert witnesses who will interview you and check how much you know about the code you claim you wrote.
This doesn't really make sense; in no way can an "expert" interview definitively establish whether someone wrote a piece of code, especially if the person has access to the code beforehand.
They don't need to prove it 100%. They just have to show that it's likely you did.
I believe the standard can be as low as "more likely than not".
Obviously, we aren’t going to agree on this at all. I hope you have a good day.
So I’m responsible for pushing the giant boulder at the top of the hill.
The humans at the bottom who were crushed should blame the boulder, which happened to be moving.
I'm not sure what point you are trying to make.
He's making a point about responsibility/liability.
If you only get copyright for the prompt you make, but not the output, then it's like being responsible only for the prompt, but not the output.
I.e., he's only responsible for pushing the boulder up the hill. The fact that it rolled down the hill and crushed someone's house "isn't his fault" (he doesn't get copyright on it).
That is not how responsibility works anywhere. If you steal a gun and murder someone with it, you are still responsible, even though it is not your gun.
Well, you are responsible for the consequences. Liability is simply a different thing than copyright.
The copyright office says that you don't get copyright because you're not considered the author:
https://www.copyright.gov/ai/
>The Office concludes that, given current generally available technology, prompts alone do not provide sufficient human control to make users of an AI system the authors of the output. Prompts essentially function as instructions that convey unprotectible ideas. While highly detailed prompts could contain the user’s desired expressive elements, at present they do not control how the AI system processes them in generating the output.
If you're not the author then why would you have to be liable for it?
> If you're not the author then why would you have to be liable for it?
If you do not understand this, make sure that you always operate within a framework of people who do, because this sort of misunderstanding can cause you a world of grief.
Because you are the person shipping it, and as such regular liability applies. If I'm not the author of a book and make a lot of copies and distribute those, I'm liable for the content of that book, regardless of whether or not I hold the copyright to it. Conversely, if the original author sues because they feel their work has been infringed, then that too is a liability that stems from the distribution.
And 'distribution' is a pretty wide term, not unlike 'interstate commerce': lots of things that you might not consider to be distribution can be classified as such in court.
Different laws do not come in packages; they apply individually, and sometimes they apply collectively, but it isn't a menu where you can pick the combination that you think makes the most sense.
Oh, I do understand it - laws are contradictory and can be made to do whatever the people who shout the loudest want them to do (though they don't always work that way). I just think that it is extremely bad when laws work this way.
Technically, when you select "copy image" instead of "copy image URL" and paste that to a friend, you're often committing copyright infringement. Do I think this is reasonable? Absolutely not. The same goes for this: the author should hold liability, so make the person who ends up causing the work to exist the damn author.
But nooo, we can't have that. Instead we need to have these convoluted exceptions that don't at all work how the real world works, so that lawyers can have even more work.
Besides, if we go by "the law" then we already have a court case where training an AI model is protected by fair use. But obviously that isn't satisfying enough for people, so they keep talking about how it's stealing (refer to my first sentence).
Also, this situation is going to get funny when some country decides that AI generated content does get copyright protection.
> Oh, I do understand it - laws are contradictory and can be made to do whatever the people who shout the loudest want them to do (though they don't always work that way). I just think that it is extremely bad when laws work this way.
You are completely misunderstanding GP's distinction between ownership and liability.
In short, if you use someone else's car to kill someone, you are still liable for killing that person even though you don't own the car.
Do you disagree with that statement?
You can't really argue that things are a certain way when that contradicts the way the law works; that's a recipe for disaster. The rules have been set; you can disagree with them, and then you will be forced to litigate, which is both expensive and time consuming. Purposefully going against the grain is only for those with extremely deep pockets (and for lawyers...).
> Besides, if we go by "the law" then we already have a court case where training an AI model is protected by fair use.
Yes, but training an AI is a completely different thing than distributing the work product generated by that AI.
Note that I don't agree with all aspects of copyright law either, but I'll be happy to play by the rules as set today simply because I can't afford to be wrong and held liable for infringement. For instance, I strongly believe that the length of copyright is a problem (and don't get me started on patents, especially on software). I also believe that only the original author should hold copyright, not the company they worked for, their heirs (see Ravel for a really nasty case), or anybody else. I believe copyrights should not be transferable at all.
But because I'm a nobody and not wealthy enough to challenge the likes of Disney in court I play by the rules.
As for 'this situation is going to get funny when some country decides that AI generated content does get copyright protection':
Copyright is one of the most harmonized legislative constructs in the world. Almost every country has adopted it, often without meaningful change. In practice, US courts are obviously a very important driver behind changes in copyright law, but in general these changes tend to lean towards more protection for copyright owners, not less. So far the Trump admin has not touched copyright law in their usual heavy-handed manner. I'm not sure if this is by design or by accident, but maybe there are lines that even they cannot easily cross without massive consequences.
Some parties in the AI/copyright debate are talking out of both sides of their mouth. For instance, Microsoft is heavily relying on being able to infringe copyright at will, while at the same time jealously guarding their own code. Such hypocrisy is going to be the main wedge that those in favor of strong copyright will use to reduce the chances that AI work product deserves copyright: after all, if the output is original and not transformative, then Microsoft could (and should!) train their AI on their own confidential code. But they're not doing that; maybe they know something you and I do not...
If you hold an illegal party on public land, you would still be liable, even though you did not own the land.
But that's not a comparable situation at all, because it is your party. It doesn't matter where it is held; we assign "ownership" of the party to you. Even the language we use explicitly states that. In the case of copyright, the copyright office explicitly states that you are not the author of an AI-generated work.
The same point applies if an animal takes a picture.
In some places, simply not keeping the public street in front of your property ice-free can incur liability, even when you are not actually there when it snows. There are so many such examples that I'm kind of surprised to see this kind of confused argument made here.