> The idea that I could eventually ask ChatGPT or whatever about obscure things in my field, and get useful output (of the "trust but verify" sort), is exciting.
That's your idea, not the one they are going with.
Their idea is that you pay a fee to access any information that was freely available.
Your idea is tearing down of fences, their idea is gatekeeping. The two ideas are incompatible.
> Their idea is that you pay a fee to access any information that was freely available.
An LLM containing the information doesn’t take away from the book being available at the library.
It’s an additional way to access the information. A company charging a fee for it doesn’t stop you from going to the library if you want to.
> Your idea is tearing down of fences, their idea is gatekeeping. The two ideas are incompatible.
You act like the parent commenter is permanently stealing the book from the library and gifting it to a private training set.
Information being available from more places, even if some of them are paid, doesn’t amount to gatekeeping.
There are also open-weight LLMs that can be downloaded and run locally. Some of these are being fine-tuned for specific topics on topical datasets, which opens up even more interesting opportunities (this is exactly what the linked article is about).
Who is "their"?
There are plenty of open models you can download today and run. No gatekeeping. No fencing.
This whole "AI is evil" trope is getting a bit tired.
Their idea is being able to get answers to questions that were difficult to answer before[0]. Of course they want to get paid for it. The information wasn’t easily available, and not always[1] freely.
[0] among other things…
[1] more like ‘often not at all’
> Of course they want to get paid for it.
So should the original authors, no? That is, getting a share of that payment.
Something akin to the German GEMA could work: an entity that levies a usage fee on behalf of all copyright holders and redistributes it to its members, but on a global scale.
> So should the original authors, no? That is, getting a share of that payment.
Should they? Yes. Will they?
Well, have LLM builders paid for any copyrighted work so far?
Well, not yet. It's a matter of organization, regulation and litigation.
I was thinking along the lines of concepts that already exist, such as the private copying levy [0]. It basically forces a blanket tax on a certain class of products, which then gets redistributed to members of a collecting society such as GEMA [1].
This way, you would force LLM builders to effectively pay a tax by law. Since these models do not work at all without the underlying content, make it proportionate. Let's say 50-70% to make it fair.
[0] https://en.wikipedia.org/wiki/Private_copying_levy
[1] https://en.wikipedia.org/wiki/GEMA_(German_organization)
> Their idea is that you pay a fee to access any information that was freely available.
And that will eventually be distilled into open-weight models.