I can understand where they're coming from. If most of the pull-requests were AI-coded, well, the maintainers are equally capable of prompting Claude Code themselves.

I think the whole game of software engineering, open source or not, has completely changed. A lump of code doesn't mean or imply the same thing as it did 2 years ago.

I think this is the key point.

A few years ago, if I send a complex PR that compiles and passes tests, that implies a certain amount of time and cognitive investment on my part. It seems likely that I wouldn't invest that if I didn't also understand the codebase, the feature or bug I'm working on, etc.

Now, that understanding is roughly as expensive as before, but AI has vastly reduced the cost of generating the code that compiles and passes tests.

Probably-well-intentioned community members are happy to contribute the cheap thing (Claude Code tokens) but, because it's so cheap, it's not a good indicator they've contributed the expensive thing (human understanding).

Also, this paper seems relevant: https://www.nber.org/papers/w35275

"Writing Code vs. Shipping Code: Productivity Effects Across Generations of AI Coding Tools"

As the FT summarizes,

> They found an explosive impact at the top of this funnel — coders created or edited almost 300 per cent more files — but that boost was halved to 150 per cent by the time they got to the number of discrete pieces of work submitted for review, and that in turn shrunk fivefold to a roughly 30 per cent uplift in the number of full software releases.

https://www.ft.com/content/8e9ae7a4-7209-4e2c-aa36-f3af77d6c...

So as I wrote, AI vastly improves labor productivity on _coding_, but not nearly as much on code _review_ or other parts of the release pipeline.
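To make the funnel arithmetic concrete, here's a quick Python sketch using only the rounded figures from the FT quote. It assumes all three uplifts are measured against the same pre-AI baseline, which the quote doesn't spell out, and the function and variable names are mine, not the paper's.

```python
# Rounded uplifts from the FT summary, in per cent over the pre-AI baseline.
# ASSUMPTION: all three are measured against the same baseline.
UPLIFT_PCT = {
    "files_touched": 300,         # "almost 300 per cent"
    "submitted_for_review": 150,  # "halved to 150 per cent"
    "releases": 30,               # "roughly 30 per cent"
}


def multiplier(uplift_pct):
    """Turn a percentage uplift into a multiple of baseline (+300% -> 4.0x)."""
    return (100 + uplift_pct) / 100


def shrink_factors(uplifts):
    """How many times smaller each stage's uplift is than the previous stage's."""
    stages = list(uplifts)
    return {
        f"{a} -> {b}": uplifts[a] / uplifts[b]
        for a, b in zip(stages, stages[1:])
    }


def releases_per_file(uplifts):
    """Releases per file touched, relative to baseline (1.0 = unchanged)."""
    return multiplier(uplifts["releases"]) / multiplier(uplifts["files_touched"])


if __name__ == "__main__":
    for stage, pct in UPLIFT_PCT.items():
        print(f"{stage}: +{pct}% = {multiplier(pct):.2f}x baseline")
    for step, factor in shrink_factors(UPLIFT_PCT).items():
        print(f"{step}: uplift shrinks {factor:.0f}x")
    print(f"releases per file touched: {releases_per_file(UPLIFT_PCT):.3f}x baseline")
```

If that same-baseline assumption holds, each file touched now turns into roughly a third as many releases as it used to, which is the review and release bottleneck in a single number.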

And, unfortunately, for many open source projects, it's easy for volunteers to send code for review, but hard for them to volunteer reviewing PRs, managing releases, etc.

> that implies a certain amount of time and cognitive investment on my part

Yes, this is the takeaway for me. A PR can no longer be a reasonable proof of work.

> If most of the pull-requests were AI-coded, well, the maintainers are equally capable of prompting Claude Code themselves.

I see this position a bit: the notion that AI-generated code has no value. I think it's easy to generate zero-value code, but I don't agree that all AI-generated code is zero-value. I've been working on my side projects in OpenCode, and I spend quite a bit of time prompting, setting up the right files, descriptions of the product I'm trying to build, and the roadmap for it. I have a tight validation loop that lets me run through a bunch of automated checks after each change, and then I do a bunch of manual testing through edge cases that the generated feature might screw up, and then I iterate. It's a different kind of work, but I can make progress more quickly than I could coding by hand. Validation loops are the critical component.

My experience doing this over the past months is that using AI to code is a skill, and I learn new techniques and get better at it as I try stuff. But that also suggests that, when done well, it can produce something of value.

All of this is to say: while I take issue with your first sentence, I completely agree with your second sentence. What we've lost is the ability to distinguish easily between something well thought out and something generated thoughtlessly. Focusing on cheap validation would help here immensely, as well.

The code just isn’t the main effort of work anymore. Anyone can generate the implementation, so it makes more sense than ever to instead hammer out the what, why, and how that underlies any code change.

I see all projects moving in this direction. It makes more sense to hash out a plan together.

Code was never the main effort of work, but it was a clear signal that someone had done the main effort, which is understanding the codebase, designing a new feature, or investigating a bug, and had the knowledge to write the code. By the time you got to review, you could expect a knowledgeable person on the other end.

It’s the same with published journal articles. A lot of them are only a few pages, which is maybe an hour of typing. But everyone knows that the typing is not the work.

Right, and all of that is what I consider to be the "code" effort:

Deep research in the codebase, deciding on the flavor of code to write that matches the project, deciding how you'll model the feature with types, how to architect it so that it's testable, writing the tests, foreseeing cases beyond the obvious path, etc.

What changed is that it can be automated. Or, for the sake of argument, just grant a world where AI is perfect at implementation.

Now our time/energy/attention is freed up to concentrate the work around planning what to build. And the interesting part is that it becomes the input into the AI implementor.

This is a good thing, because we tended to skip the planning stage since it's hard in its own way. Or we'd start building something and then try to synthesize a high-level direction from it, but by then refactoring was so expensive that we were committed to a solution.

As they say, just send me the prompt instead; at least that's more useful.

For anything but the most trivial change, a prompt is not enough though. There's a long iterative process of generating the right code, reviewing it, testing it, experimenting with UX or design for maintainability, fixing bugs... even a predominantly AI-generated PR can capture a lot of value. But apparently trying to distinguish those from the 'one-shot' vibe coded PRs is too much work for the Ladybird team.

> But apparently trying to distinguish those from the 'one-shot' vibe coded PRs is too much work for the Ladybird team.

Yes, that is exactly what this announcement is about: it was too much work for them to tell those two apart.

Might as well just write the thing out yourself - you will learn something by doing that and it will be easier next time. :)

> But apparently trying to distinguish those from the 'one-shot' vibe coded PRs is too much work

That is exactly the issue. Projects that are end-user applications - as opposed to libraries or development tools - probably see far more slop than actual work like you've described. The yields are too low for it to make any sense to try to figure out which is which; their time is better spent doing the work.

Yeah, but they could get free token usage from the community.

Yeah, but then it’s either an arduous manual review or incurring a bunch of token usage to review something that may be slop.

When I contribute to OSS with AI, I’m essentially engaging in a donation-matching scheme for the open source project, where Anthropic matches every dollar I invest roughly 20 to 1 (usually I can get ~2k of value per month on the $100 plan).

So any project that doesn’t accept AI PRs is really missing out on significant investment.

> (usually I can get ~2k of value per month on the $100 plan)

Would you pay $2000 for those tokens? If not, the number is meaningless.

> the maintainers are equally capable of prompting Claude Code themselves

I'm 100% on the side of maintainers here, but this is BS. If you could "just prompt Claude yourself", the AI productivity boosts would be in the hundreds if not thousands of percent, which is demonstrably and self-evidently not the case (at least as of June 2026).