> all evidence points to AI bringing at least 10-20% more productivity.

No, actually all evidence points exactly to a ~20% slowdown: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

The "evidence" you're thinking of is probably that dopamine hit you felt when the shit-generator spat out a complete half-finished React app. But that's not evidence of increased productivity, unless we now measure productivity by the size of the codebase bloat.

Newer evidence from the same group invalidates the outdated claim: https://metr.org/blog/2026-05-11-ai-usage-survey/

No, it doesn't. That data is self-reported, and the 2025 study explicitly compares its measurements _against_ self-reported data.

Granted, I agree models have improved since then, but still.

> Importantly, survey results are not necessarily grounded in reality. There are reasons to be skeptical of people’s responses to counterfactual questions such as about AI’s effect on productivity — for instance, our study in early 2025 found that people overestimated AI’s effect on their time spent on tasks by 40 percentage points on average.

- your linked study

> No, actually all evidence points exactly to a ~20% slowdown

It contradicts this.

Not really? I mean, the original study says this, in essence:

1. Developers _self_-report a 1.5-2x increase in productivity.
2. Empirical measurements show a 20% slowdown.

Now, that study is from 2025, so it may be outdated.

The study you linked, which you claim contradicts the 20% slowdown, contains only _self-reported_ data, which the original study shows is unreliable and overestimates productivity.

In fact, the study _you linked_ says this:

> In 2 of these 7 cases we were able to view public outputs from the work completed using AI. We are confident that in both cases the participants are overstating their change in value produced as we understand it; at least, the enormously more valuable work is not externally visible.

> In 2 other cases, we are somewhat suspicious of large changes in the value of work produced because qualitative claims do not match our intuitive sense of agent capabilities.

> There is 1 instance where our best sense is that the respondent does indeed have agents doing an impressive quantity of productive work, although we suspect that this work is better captured by improved speed or quantity of output, not improved value of output.

> In the remaining 2 cases we lack sufficient qualitative information to meaningfully interpret.

(edits for formatting)