From the article:

> He fed only the API and the test suite to Claude and asked it to reimplement the library from scratch.

From GPL2:

> The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable.

Is a project's test suite not considered part of its source code? When I make modifications to a project, its test cases are very much a part of that process.

If the test suite is part of this library's source code, and Claude was fed the test suite or interface definition files, is the output not considered a work based on the library under the terms of LGPL 2.1?

It's transformative, so no.

Legally, using the tests to help create the reimplementation is fine.

However, it seems possible you can't redistribute the same tests under the MIT license. So the MIT-licensed distribution of the reimplementation might need to be source code only, not source code plus tests. Or the tests can be distributed in parallel but still under LGPL, not MIT. It doesn't really matter, since compiled software won't include the tests anyway.

> It's transformative, so no.

I'm not following your logic there, and I don't see any mention of "transformative" in the license. Can you explain what you mean?

Sorry, I misspoke. Transformation is what makes the LLM itself legal -- its training data is sufficiently transformed into weights.

And so, a work being sufficiently transformative is one way in which copyright no longer applies, but that's not the case here specifically. The specific case here is essentially just a clean-room reimplementation (technically less "clean", but presumably treated the same legally). The end result is still a completely different expression of underlying non-copyrightable ideas.

And in both cases, it doesn't matter what the original license was. If a resulting work is sufficiently transformative or a reimplementation, copyright no longer applies, so the license no longer applies.

That's interesting, but it misses my point:

The library's test suite and interfaces were apparently used directly, not transformed. If either of those is considered part of the library's source code, as the license's wording seems to suggest, then I think output from their use could be considered a work based on the library as defined in the license.

Legally that's been established as acceptable.

Google LLC v. Oracle America assumed (though didn't establish) that APIs are copyrightable... BUT held that developing against them falls under fair use, as long as the function implementations are independent.

Test suites are again generally considered copyrightable... but the behavior being tested is not.

So no, it's not considered to be a work based on the library. This seems pretty clear-cut in US law by now.

Also, the LGPL text doesn't say "work based on the library". It says "If you modify a copy of the Library", and this is not a "combined work" either. And the whole point is that this is not a modified copy -- it's a reimplementation.

In theory, a license could be written to prevent its tests from being run against software not derived from the original, i.e. clean-room reimplementations. In practice, it remains dubious whether any court would uphold that. And it would also be trivial to get around it, by taking advantage of fair use to re-implement the tests in e.g. plain English (or any specification language), and then re-implementing those back into new test code. Because again, test behaviors are not copyrightable.
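To make that two-step idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `SPEC`, `toy_detect`, and `run_spec` are made-up names for illustration, and `toy_detect` is a stand-in, not any real library's detector. It only shows the mechanics of writing behaviors down as a plain-language spec and then deriving fresh test code from that spec; it says nothing about how a court would treat it.

```python
import codecs

# Step 1 (hypothetical): behaviors written down as a plain-language spec,
# with no code carried over from an original test suite.
SPEC = [
    {
        "behavior": "input starting with a UTF-8 BOM is reported as UTF-8-SIG",
        "input": codecs.BOM_UTF8 + b"hello",
        "expected": "UTF-8-SIG",
    },
    {
        "behavior": "pure ASCII input is reported as ascii",
        "input": b"hello",
        "expected": "ascii",
    },
]


def toy_detect(data: bytes) -> str:
    """Stand-in implementation under test (an assumption, not a real API)."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8-SIG"
    if all(byte < 0x80 for byte in data):
        return "ascii"
    return "unknown"


def run_spec(detect, spec):
    """Step 2: new test code derived only from the spec entries.

    Returns the descriptions of the behaviors that the implementation fails.
    """
    return [
        case["behavior"]
        for case in spec
        if detect(case["input"]) != case["expected"]
    ]


failures = run_spec(toy_detect, SPEC)
```

The spec entries describe behavior only, so the test runner has nothing to copy from except those descriptions.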

> Also, the LGPL text doesn't say "work based on the library".

It does, about a dozen times.

Are you perhaps referring to LGPL3? I think the license under discussion here is LGPL2.1.

https://github.com/chardet/chardet/blob/6.0.0/LICENSE

I'm not well versed in copyright case law, so I won't argue with the rest of what you wrote. Thanks for elaborating on your thoughts.

Oh sorry, I was indeed looking at the newer LGPL version. I stand corrected, thanks! But yes, all the same points stand. Good discussion!

But the tests were transformed to the new language; they were not copied as-is.

Software patents would work as you describe, but not copyright.

Google v. Oracle ruled that use of APIs is fair game, and it could be argued that test cases are strictly a use of APIs and not an implementation.

Google vs Oracle ruled that APIs fall under copyright (the contrary was thought before). However, it was ruled that, in that specific case, fair use applied, because of interoperability concerns. That's the important part of this case: fair use is never automatic, it is assessed case by case.

Regarding chardet, I'm not sure "I wanted to circumvent the license" is a good way to argue fair use.