The practical concern of Linux developers regarding responsibility is not being able to ban the author; it's that the author should take ongoing care of his contribution.
That's not going to shield the Linux organization.
A DCO bearing a claim of original authorship (or assertion of other permitted use) isn't going to shield them entirely, but it can mitigate liability and damages.
Can it though? As far as I know this hasn’t been tested.
In a court case the responsible party could very well be the Linux Foundation, because this is a foreseeable consequence of allowing AI contributions. There’s no reasonable way for a human to make such a guarantee while using AI-generated code.
It’s not about the mechanism: responsibility is a social construct, it works the way people say that it works. If we all agree that a human can agree to bear the responsibility for AI outputs, and face any consequences resulting from those outputs, then that’s the whole shebang.
Sure we could change the law. It would be a stupid change to allow individuals, organizations, and companies to completely shield themselves from the consequences of risky behaviors (more than we already do) simply by assigning all liability to a fall guy.
What law exactly are you suggesting needs to be changed? How is this any different from what already happens right now, today?
Right now it's very easy not to infringe on copyrighted code if you write the code yourself. In the vast majority of cases, if you infringed it's because you did something wrong that you could have prevented (and in the case where you didn't do anything wrong, independent creation is an affirmative defense against copyright infringement).
That is not the case when using AI generated code. There is no way to use it without the chance of introducing infringing code.
Because of that if you tell a user they can use AI generated code, and they introduce infringing code, that was a foreseeable outcome of your action. In the case where you are the owner of a company, or the head of an organization that benefits from contributors using AI code, your company or organization could be liable.
So it's a bit as if Linux Organization told its contributors you can bring in infringing code but you must agree you are liable for any infringement?
But if a lawsuit were later brought, who would be sued? The individual author or the organization? In other words, can an organization reduce its liability if it tells its employees "You can break the law as long as you agree you are solely responsible for such illegal actions"?
It would seem to me that the employer would be liable if they "encourage" this way of working?
It’s a foreseeable outcome that humans might introduce copyrighted code into the kernel.
I think you’re looking for problems that don’t really exist here, you seem committed to an anti AI stance where none is justified.
A human has to willingly violate the law for that to happen, though. There is no way for a human to use AI-generated code that doesn't have a chance of producing copyrighted code. That's just expected.
If you don't think this is a problem, take a look at the terms of the enterprise agreements from OpenAI and Anthropic. Companies recognize this is an issue and so they were forced to add an indemnification clause, explicitly saying they'll pay for any damages resulting from infringement lawsuits.
> Right now it's very easy not to infringe on copyrighted code if you write the code yourself.
Humans routinely produce code similar to or identical to existing copyrighted code without direct copying.
They don’t produce enough similar code to infringe frequently. And if they did, independent creation is an affirmative defense to copyright infringement that likely doesn’t apply to LLMs, since they have the demonstrated capability to produce code directly from their training set.
You have shifted from "very easy not to infringe" to "don't infringe frequently", which concedes the original point that humans can and do produce infringing code without intent.
On independent creation: you are conflating the tool with the user. The defense applies to whether the developer had access to the copyrighted work, not whether their tools did. A developer using an LLM did not access the training set directly, they used a synthesis tool. By your logic, any developer who has read GPL code on GitHub should lose independent creation defense because they have "demonstrated capability to produce code directly from" their memory.
LLM memorization/regurgitation is a documented failure mode, not normal operation (nor typical case). Training set contamination happens, but it is rare and considered a bug. Humans also occasionally reproduce code from memory: we do not deny them independent creation defense wholesale because of that capability!
In any case, the legal question is not settled, but the argument that LLM-assisted code categorically cannot qualify for independent creation defense creates a double standard that human-written code does not face.
> You have shifted from "very easy not to infringe" to "don't infringe frequently", which concedes the original point that humans can and do produce infringing code without intent.
Practically speaking humans do not produce code that would be found in court to be infringing without intent.
It is theoretically possible, but it is not something that a reasonable person would foresee as a potential consequence.
That’s the difference.
> LLM memorization/regurgitation is a documented failure mode, not normal operation (nor typical case).
Exactly. It is a documented failure mode that you as a user have no capacity to mitigate or to even be aware is happening.
Double standards are perfectly fine. LLMs are not conscious beings that deserve protection under the law.
>not settled.
What appears to likely be settled is that human authorship is required, so there’s no way that an LLM could qualify for independent creation.
And that's not an infringement. Actual copying is the infringement, not having the same code. The most likely way to have the same code is by copying, but it's not the only way.
In this case, the "fall guy" is the person who actually introduced the code in question into the codebase.
They wouldn't be some patsy that is around just to take blame, but the actual responsible party for the issue.
Imagine you're a factory owner and you need a chemical delivered from across the country, but the chemical is dangerous, and if the tanker truck drives faster than 50 miles per hour it has a 0.001% chance per mile of exploding.
You hire an independent contractor and tell him that he can drive 60 miles per hour if he wants to but if it explodes he accepts responsibility.
He does, and it explodes, killing 10 people. If the families of those 10 people have evidence you created the conditions to cause the explosion in order to benefit your company, you're probably going to lose in civil court.
Linus benefits from the increased velocity of people using AI. He doesn't get to put all the liability on the people contributing.
Cool analogy! Which has nothing to do with the topic at hand.
That is a nonsensical analogy on multiple levels, and doesn't even support your own argument.
Nice rebuttal.
Why would I put much effort into responding to a post like yours, which makes no sense and just shows that you don't understand what you're talking about?
Why would you put any effort into it at all?
Responsibility is an objective fact, not just some arbitrary social convention. What we can agree or disagree about is where it rests, but that's a matter of inference, and an inference can be more or less correct. We might assign certain people certain responsibilities before the fact, but that's to charge them with the care of some good, not to blame them for things before they were charged with their care.
Because contributions to Linux are meticulously attributed to, and remain property of, their authors, those authors bear ultimate responsibility. If Fred Foobar sends patches to the kernel that, as it turns out, contain copyrighted code, then provided upstream maintainers did reasonable due diligence the court will go after Fred Foobar for damages, and quite likely demand that the kernel organization no longer distribute copies of the kernel with Fred's code in it.
Anyone distributing infringing material can be liable, and it’s unlikely that this technicality would actually shield anyone.
Anyone who thinks they have a strong infringement case isn’t going to stop at the guy who authored the code, they’re going to go after anyone with deep pockets with a good chance of winning.
> Anyone distributing infringing material can be liable
There is still the "mens rea" principle. If you distribute infringing material unknowingly, it would very likely not result in any penalties.
Copyright is strict liability. There’s no mens rea required.