There is also the possibility that an LLM judge would be happy with code that looks LLM-generated, but a maintainer of a specific project might not merge it for stylistic reasons.
I think the intent was to specifically train an LLM to judge what a specific maintainer would consider to be good style.