How do you do phase 2 with an LLM when the LLM was likely trained on the original source code? Isn't this the equivalent of "rewriting" Harry Potter by describing the plot to an LLM trained on the original books[1]?
[1] https://arstechnica.com/features/2025/06/study-metas-llama-3...
Well, check out the "clean rewrite" design document directly: https://github.com/chardet/chardet/commit/f51f523506a73f89f0... (referenced in https://github.com/chardet/chardet/issues/327#issuecomment-4...)
Writing in a plan "no GPL/LGPL code" does not actually mean "forget all the GPL/LGPL code that you have ever seen, so that you start from a clean slate".
Agreed. No amount of system/user prompt directives changes the fact that the LLM has already been trained on copyrighted code. It's amazing how many people fail to grasp that.
This is the "Don't think of a pink elephant" fallacy all over again.