I use AI for most of those things. And I think it probably saves me a bit of time.
But in that study that came out a few weeks ago where they actually looked at time saved, every single developer overestimated their time saved. To the point where even the ones who lost time thought they saved time.
LLMs are very good at making you feel like you’re saving time even when you aren’t. That doesn’t mean they can’t be a net productivity benefit.
But I’d be very very very surprised if you have real hard data to back up your feelings about your work taking you half as long and being equal quality.
That study predates Claude Code though.
I’m not surprised by the contents. I had the same feeling; I made some attempts at using LLMs for coding prior to CC, and with rare exceptions it never saved me any time.
CC changed that situation hugely, at least in my subjective view. It’s of course possible that it’s not as good as I feel it is, but I would at least want a new study.
I don’t believe that CC is so much better than Cursor using Claude models that it moves the needle enough to flip the results of that study.
The key thing to look at is that even the participants who did objectively save time overestimated their time saved by a huge amount.
But also you’re always likely to be at least one model ahead of any studies that come out.
> That study predates Claude Code though.
Is there a study demonstrating Claude Code improves productivity?
I mean, I used to average 2 hours of intense work a day and now it’s 1 hour.
How are you tracking that? Are you keeping a log, or are you just guessing? Do you have a mostly objective definition of intense work or are you just basing it on how you feel? Is your situation at work otherwise exactly the same, or have you gotten into a better groove with your manager? Are you working on exactly the same thing? Have you leveled up with some more experience? Have you learned the domain better?
Is your work objectively the same quality? Is it possible that you are producing less but it’s still far above the minimum so no one has noticed? Is your work good enough for now, but a year from now when someone tries to change it, it will be a lot harder for them?
Based on the only real studies we have, humans grossly overestimate AI time savings. It’s highly likely you are too.
_sigh_. Really dude? Just because people overestimate them on average doesn’t mean every person does. In fact, you should be well versed enough in statistics to understand that it will be a spectrum, highly dependent on both a person’s role and how they use the tool.
Any given new tool has a range of usefulness that depends on many factors and affects people differently as individuals. Just because a carpenter doesn’t save much time because Microsoft Excel exists doesn’t mean it’s not a hugely useful tool, and doesn’t mean it doesn’t save a lot of time for accountants, for example.
Instead of trying to tear apart my particular case, why not entertain the possibility that it’s more likely I’m reporting pretty accurately but it’s just I may be higher up that spectrum - with a good combo of having a perfect use case for the tool and also using the tool skilfully?
> _sigh_. Really dude? Just because people overestimate them on average doesn’t mean every person does.
In the study, every single person overestimated time saved on nearly every single task they measured.
Some people saved time, some didn’t. Some saved more time, some less. But every single person overestimated time saved by a large margin.
I’m not saying you aren’t saving time, but if you aren’t tracking things very carefully, it’s very likely you are overestimating.
I’ll admit it’s possible my estimates are off a bit. What isn’t up for debate though is that it’s made a huge difference in my life and saved me a ton of time.
The fact that people overestimate its usefulness is somewhat of a “shrug” for me. So long as it _is_ making big differences, that’s still great whether people overestimate it or not.
If people overestimate time saved by huge margins, we don’t know whether it’s making big differences or not — or more specifically, whether the boost is worth the cost (both monetary and otherwise).
Only if we’re only using people’s opinions as data. There are other ways to do this.