Because while you may be a good actor, there are just as many bad actors out there.

How does Anthropic or OpenAI differentiate between the two?

Once you solve that, you can get access to Mythos ;)

More importantly, who gets to decide what's good or bad?

Remember, all of these models are based on unimaginable levels of copyright infringement. Is OpenAI a bad actor because they use their models to infringe on the rights of others?

This isn't a moral argument. This is all about power and money, not good or bad. That includes the Mythos ban. Good vs bad actors is political theater designed to distract from what's actually going on.

> unimaginable levels of copyright infringement

This isn't how copyright works. The models don't encode the original works wholesale; they are substantive transformations of them. Now, you yourself as a user can use the models and weights to infringe on a copyright.

There have been some US cases about this, but it isn't generally settled internationally. "Fair use" is a US-specific thing. Even in the US there are ongoing cases.

Paper about how weights are a derivative work of the training data: https://arxiv.org/abs/2407.13493

Lawsuits currently in progress about AI copyright: https://informationisbeautiful.net/visualizations/the-rise-o...

Yeah, I'm familiar with that argument re derivative work, but weights aren't really what's being shipped or sold, and I think it's reasonable to argue that the generated tokens aren't derivative but substantively transformed.

That said, I would prefer a situation where hyperscalers make an effort to compensate sources of good data, e.g. newspapers and so on.

Like it or not, the court in Bartz v. Anthropic held that training on the works is fair use. So it isn't legally copyright infringement as currently understood under the law. This may change, but it isn't obviously wrong.

I think parent poster was referring to the open secret that the early models were trained on massive collections of pirated novels and textbooks.

> How does Anthropic or OpenAI differentiate between the two?

So if they can't, why do some companies still get access today? Just the ones much bigger than "us".

It's the equivalent of saying a company like Amazon or Cloudflare should block access to web hosting because of "illegal hosting". The argument back then was that they aren't gatekeepers. But now they are?

It's really odd to take two completely different things and try to apply the same law to them. Hosting was somewhat protected by previous rulings; selling AI services is not.

> It's really odd to take two completely different things and try to apply the same law to them. Hosting was somewhat protected by previous rulings; selling AI services is not.

What's different? What's not protected? And what's "hosting"? Where do you draw the line with "managed services"?

So using "AI" to hack a computer is different from using "hosting" to put up "illegal content"?

Are you implying one of them is legal? Both are for a judge to decide.

Or if this is about the provider -- who's selling AI services? It's an LLM. Just software running on GPUs. There's no AI. There, done. Same.