I think it's a complicated issue.
A lot of low-quality AI contributions come from the free tiers of these AI models, whose output is pretty crap. On the other hand, if you max out the model configs, i.e. get "the best money can buy", then those models are actually quite useful and powerful.
OSS should not miss out on the power LLMs can unleash. I'm talking only about the maxed-out versions of the newest models, i.e. stuff like Claude 4.5+ and Gemini 3, so developments of the last 5 months.
But at the same time, maintainers should not have to review code written by a low-quality model (and the high-quality models, for now, are all closed, although I've heard good things about Minmax 2.5, which I haven't tried).
Given how hard it is to tell which model produced a specific output without doing an actual review, I think it would make the most sense to have a rule restricting AI use to trusted contributors only: maintainers as a start, and maybe some trusted group of contributors who you know use the expensive but useful models, and not the cheap but crap models.
It's the difference between raw LLM output vs LLM output that was tweaked, reviewed and validated by a competent developer.
Both can look like the same exact type of AI-generated code. But one is a broken useless piece of shit and the other actually does what it claims to do.
The problem is just how hard it is to differentiate the two at a glance.
> It's the difference between raw LLM output vs LLM output that was tweaked, reviewed and validated by a competent developer.
This is one of those areas where you might have been right... 4-6 months ago. But if you're paying attention, the floor has moved up substantially.
For the work I do, last year the models would occasionally produce code with bugs, linter errors, etc. Now the frontier models produce mostly flawless code that I don't need to review. I'll still write tests, or prompt test scenarios for it, but most of the testing is functional.
If the exponential curve continues, I think everyone needs to prepare for a step-function change. Debian may even cease to be relevant because AI will write something better in a couple of hours.
This very much depends on the domain you work in. Small projects in well-trodden domains are a great fit for AI. SaaS projects can essentially be one-shot. But large projects, projects with specific standards or idioms, projects pinned to particular language versions, projects with performance or hardware concerns, all things the Debian project has to deal with, aren't 'solved' in the same way.
The tacit understanding in all of this is that valued contributors can use AI as long as they can "defend the code", if you will, because AI used lightly and in that way would be indistinguishable from knuthkode.
The problem is that having an unwritten rule is sometimes worse than having a written one, even if it "works".