Dude come on, I clearly wasn't saying LLMs are people. My point was it's a tool and it's the responsibility of the person wielding it to check outputs.
If it's too hard to check outputs, don't use the tool.
Your arguments about copyright being different for LLMs: at the moment that's still being defined legally. So for now it's an ethical concern rather than a legal one.
For what it's worth I agree that LLMs being trained on copyright material is an abuse of current human oriented copyright laws. There's no way this will just continue to happen. Megacorps aren't going to lie down if there's a piece of the pie on the table, and then there's precedent for everyone else (class action perhaps)
Alright, I did make that assumption because I've seen and heard people talk about LLM as people. It worries me that otherwise functional and reasonable people, some of them my friends, have been so easily been convinced by a machine which demonstrated its flaws to me daily.
As for checking outputs - I don't believe that's sufficient. Maybe the letter of the law is flawed but according to the spirit the model itself is derivative work.
A model takes several orders of magnitude more work as training data than it takes to code the training algorithm itself, to any reasonable and sane person, that makes it a derivative work of the training data by nearly 100% - we can only argue how many nines it should be.
> precedent
Yeah but the US system makes me very uneasy about it. The right way to do this is to sit down, talk about the options and their downstream implications, talking about fairness and justice and then deciding what the law should be. If we did that, copyright law would look very different in the first place and this whole thing would have an obvious solution.