Yeah, I'm familiar with the derivative-work argument, but the weights aren't really what's being shipped or sold, and I think it's reasonable to argue that the generated tokens aren't derivative but substantively transformed.

That said, I would prefer a situation where hyperscalers make an effort to compensate sources of good data, such as newspapers.

Like it or not, Bartz v. Anthropic held that training on copyrighted works is fair use, so it isn't copyright infringement under the law as currently understood. That may change, but it isn't obviously wrong.