In my opinion you are just wrong.
It’s an absolute game changer, and it can now multiply your productivity fivefold if it’s a solo greenfield project.
Maybe half a year ago it was as you said. You had to wait for the agent to finish, you had to review carefully, and often the result was not that great. You did not save a lot of time.
Now I can spin up 3+ parallel conversations in Codex, each in a git worktree. My work is mainly QA testing the features, refining the behavior, and sometimes making architectural decisions.
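For anyone who hasn’t tried it, the worktree setup is only a couple of commands per agent; a rough sketch (paths and branch names are placeholders):

    # one isolated checkout per agent, each on its own branch
    git worktree add ../proj-feature-a -b feature-a
    git worktree add ../proj-feature-b -b feature-b
    git worktree add ../proj-feature-c -b feature-c
    # run a separate Codex session in each directory, then
    # merge each branch back once its feature passes QA

Since every agent works in its own directory, they can’t clobber each other’s uncommitted changes.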
The results are now undeniable. In the past I could not have developed a product of that scope in my free time.
That is what is possible today. I suspect many engineers have not yet tried things that became feasible over the last months: running parallel agents, resolving merge conflicts, separating out functionality from a large branch into proper PRs.
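That last one, in manual git terms, is roughly the following sketch (branch name and commit SHAs are placeholders); the agents will happily do this kind of surgery for you:

    # carve one feature out of a big mixed branch into its own PR
    git checkout -b auth-only main
    git cherry-pick <sha-of-auth-commit-1> <sha-of-auth-commit-2>
    git push -u origin auth-only    # then open the PR from this slice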
"many engineers have not yet tried things that became feasible over the last months"
I have heard this statement every single day for 2 years, and yet we still have no companies compressing 10 years into 1 year thus exploding past all the incumbents who don't "get it".
Well, the GP mentioned
> if it’s a solo greenfield project
which is a pretty large caveat. Anecdotally, I've found my side projects (which are solo greenfield projects, and don't need to be supported to the same standards as enterprise software) have gained the boost the GP was talking about.
At work, it's different, since design, review, and maintenance are much more onerous.
If you want an example of a project that condensed 5 years into 6 months and exploded past the competition, I suggest looking at OpenClaw.
The first line of code was written on November 25th. Its adoption in the "personal agents" space far exceeded what the other companies that had tried the same thing ever achieved.
(Whether or not you trust the quality of the software, you can't deny the impact it had in such a short time. It defined a new category of software.)
OpenClaw is definitely not a "5 years" project pre-AI, though. That was more like a month of greenfield work compressed into a weekend -- which is still really impressive, don't get me wrong! -- but I think the point is we're not seeing mature, legacy codebases get outcompeted by new, agile, AI-driven codebases; we're seeing greenfield projects get spun up faster. Which, again, is still impressive and valuable.
If agents could really compress 10 years of development into 1 year, you'd see people making e.g. HFT platforms and becoming obscenely rich, not making a fun open-source project and getting hired by OpenAI as an employee.
41,964 commits is a lot more than "a month of greenfield work".
https://tools.simonwillison.net/github-repo-stats?repo=OpenC...
Didn’t we learn anything from the past? Using LOC, commit counts, or GitHub stars to measure success or productivity is so backwards. It seems everyone on the AI bandwagon is either young (and so doesn’t know our history) or has simply forgotten all the good practices of software engineering.
My bash script can do that in a few hours. The git repo contains no working software afterwards, but if that's what you want to measure...
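Something like this sketch is all it takes; an impressive commit count, zero software:

    # a deliberately useless repo with ~42,000 commits
    git init commit-farm && cd commit-farm
    for i in $(seq 1 42000); do
        git commit --allow-empty -m "feat: important change #$i"
    done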
> 41,964 commits is a lot more than "a month of greenfield work".
I meant a month for the initial release, not current state.
Regardless, much like lines of code, the number of commits is not a good metric, not even as a proxy, for how much "work" was actually done. Quickly browsing, there are plenty[0] of[1] really[2] small[3] commits[4]. Agentic coding naturally optimizes for small commits because that's what the process is meant to do, but it doesn't mean that more work is being done, or that the work is effective. If anything, looking at the changelog[5], OpenClaw feels like a directionless dumpster fire right now. I would expect a lot more from a project with multiple people working on it for 5 years, pre-AI.
[0] https://github.com/openclaw/openclaw/commit/e43ae8e8cd1ffc07...
[1] https://github.com/openclaw/openclaw/commit/377c69773f0a1b8e...
[2] https://github.com/openclaw/openclaw/commit/ffafa9008da249a0...
[3] https://github.com/openclaw/openclaw/commit/506b0bbaad312454...
[4] https://github.com/openclaw/openclaw/commit/512f777099eb19df...
[5] https://github.com/openclaw/openclaw/blob/main/CHANGELOG.md
That's why my original comment said:
> (Whether or not you trust the quality of the software, you can't deny the impact it had in such a short time. It defined a new category of software.)
I brought up OpenClaw here because the challenge was:
> we still have no companies compressing 10 years into 1 year thus exploding past all the incumbents who don't "get it".
Seriously? Commit count is right up there with lines of code as a classically dumb measurement of productivity.
Sure, but it's still a good counter to "a month of work".
No it isn't. There's basically no upper bound on the number of commits an LLM can generate. If the LLM takes 10,000 commits to do what a human would do in 10, then the comparison is meaningless.
I don't know anything about the code quality of OpenClaw, but telling me the number of commits tells me precisely nothing of use.
OK, now do that for 369,293 stars, 76,193 forks, 138 releases and 2,133 contributors.
I expect there is no number I could bring up here that won't be instantly shot down as telling "precisely nothing". My mistake for bringing up any numbers at all.
OpenClaw is a good example of a completely new project written using coding agents that made a significant impression on the world and would not have been built without them.
I'm surprised this is a hill I have to die on, but there we are.
(I'm not even a user of OpenClaw! I don't think it's secure or safe enough to use in my own life.)
It isn’t, man. Anyone can easily split a single good commit into 10 just to inflate the numbers. C’mon, this is git 101.
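The mechanics, for anyone who hasn’t done it (file names made up):

    # undo the last commit but keep all its changes in the working tree
    git reset HEAD~1
    # re-commit the same work piece by piece (or hunk by hunk with git add -p)
    git add src/parser.py && git commit -m "refactor parser"
    git add src/lexer.py && git commit -m "refactor lexer"
    git add tests/ && git commit -m "update tests"

Same work, three times the commits.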
You're framing it like the only barrier to writing wildly successful money-printing software is software development skills.
If that were true, all of these anti-AI greybeards who have been in the game for 30 years would all own their own jets.
Ideally, the given example would be something not adjacent to the presently white-hot category of "AI agents".
Like, look at e.g. YC minus the AI and AI-adjacent companies. Are those startups meaningfully more impressive or feature-rich compared to a couple of years ago?
Not yet, no. I think that's because coding agents got good in November, most people didn't notice until January, and it still takes 3-4 months to go from idea to releasing something.
I expect we will start seeing the impact of the new coding agent enhanced development processes over the next few months.
> It defined a new category of software
Which is exactly why you can't use it as an example: there is no control. This is basic stuff.
The condensation argument is totally true... Strikes me, though, that the other metric I'd look at is how long code survives before being rewritten. Feels like it's a bit early to tell on that one...
I don’t see OpenClaw making much of an impact. Maybe in your bubble?
There are credible reports of regular people in China attending dedicated events for help getting started with OpenClaw. They're not in my bubble!
https://www.reuters.com/technology/openclaw-enthusiasm-grips...
It's trash: vibe-coded markdown files around pi. This exemplifies what the OP is saying. We are at the ICO stage of LLMs. Hopefully there won't be an NFT one.
As much as I love to hate on AI: even the bad apples still produce something that one can reasonably work with.
Cryptocurrencies? Barely any use other than money laundering, buying drugs, and betting on the outcome of battles in war. And NFTs? No use at all other than money laundering and setting money ablaze.
Privacy and security from government overreach are not enough?
What privacy? Enough drug dealers have already been busted with solid evidence from tracing transaction paths on public blockchains.
The thing is, I don't care any longer. I sincerely believe velocity without direction is not a good strategy for delivering quality in the long term. And that's the real question: how sustainable is this velocity, in terms of socioeconomic concerns, product strategy, and mental health?
Velocity without direction?
I’m personally directing and QA-testing every feature.
I don’t see how socioeconomic concerns, product strategy, or mental health are an issue for me here.
I’m having a great time with my project, and it’s been the most fun I’ve had in many years of building.
All of the "solo greenfield projects" I let LLMs mostly write -- despite my supplying the scaffolding, structure, and specific implementation details as code, prompts, or context -- I can't tell you much about 6+ months later, except for the parts I did write.
It's like I never wrote them, because I didn't. I've got the gist of them, but it's the same way I get the gist of something like NumPy: I know how it works theoretically, but certainly not specifically enough to jump in and write some working Fortran that fixes bugs or adds features.
I now have a bunch of stalled projects I'm not very familiar with. I no longer do solo greenfield projects that way.
> and it can now multiply your productivity fivefold if it’s a solo greenfield project.
Why don't I see 5x as many interesting greenfield projects as before?
> if it’s a solo greenfield project
That's a big if. I don't have numbers, but most professional engineers are not working on such projects.