Don't fall into the "Look ma, no hands" hype.
Antirez + LLM + CFO = Billion Dollar Redis company, quite plausibly.
/However/ ...
As for the delta an LLM provides to Antirez outside of Redis (and outside of any problem space he is already intimately familiar with), an apples-to-apples comparison would be him trying this on an equally complex codebase he knows nothing about. I'll bet... what Antirez can do with Redis and LLMs (certainly useful, a huge quality-of-life improvement for Antirez), he cannot even begin to do with (say) Postgres.
The only way to get there with (say) Postgres, would be to /know/ Postgres. And pretty much everyone, no matter how good, cannot get there with code-reading alone. With software at least, we need to develop a mental model of the thing by futzing about with the thing in deeply meaningful ways.
And most of us day-job grunts are in the latter spot... working in some grimy legacy multi-hundred-thousand-line code-mine, full of NPM vulns, schlepping code over the wall to QA (assuming there is even a QA), and basically developing against live customers --- "learn by shipping", as they say.
I do think LLMs are wildly interesting technology; however, they are of poor utility to non-domain-experts. If organisations want to profit from the fully-loaded cost of LLM technology, they had better also invest heavily in staff training and development.
Exactly. AI is minimally useful for coding something that you couldn't have coded yourself, given enough time and without explicitly investing in generic learning not specific to that codebase or particular task.
Although calling AI "just autocomplete" is almost a slur now, it really is just that in the sense that you need to A) have a decent mental picture of what you want, and, B) recognize a correct output when you see it.
On a tangent, the inability to identify correct output is also why I don't recommend using LLMs to teach you anything serious. When we use a search engine to learn something, we know when we've stumbled upon a really good piece of pedagogy through various signals like information density, logical consistency, structuredness/clarity of thought, consensus, reviews, author's credentials etc. But with LLMs we lose these critical analysis signals.
I've been trying to articulate this exact point. The problem w/ LLMs is that at times they are very capable but always unreliable.
Absolutely spot on.
You are calling out the subtle nuance that many don’t get…
You could have another LLM tell you which is the correct output.
And when the whole world is covered in datacenters, how will we continue to scale?
Just try to focus on all the good it will bring.
... and then a third one to check whether the second one was right. Then a fourth one to... oh wait
> And pretty much everyone, no matter how good, cannot get there with code-reading alone. With software at least, we need to develop a mental model of the thing by futzing about with the thing in deeply meaningful ways
LLMs help with that part too. As Antirez says:
> Writing code is no longer needed for the most part. It is now a lot more interesting to understand what to do, and how to do it (and, about this second part, LLMs are great partners, too).
How to "understand" what to do?
How to know the "how to do it" is sensible? (sensible = the product will produce the expected outcome within the expected (or tolerable) error bars?)
> How to "understand" what to do?
How did you ever know? It's not like everyone always wrote perfect code up until now.
Nothing has changed, except now you have a "partner" to help you along with your understanding.
Well, I have a whole blog post of an answer for you: https://www.evalapply.org/posts/tools-for-thought/
Who "knows"?
It's who has a world-model. It's who can evaluate input signal against said world-model. Which requires an ability to generate questions, probe the nature of reality, and do experiments to figure out what's what. And it's who can alter their world-model using experiences collected from the back-and-forth.
Yes, most C-level executives (who often have to report to a board) have a tendency to predict the future after using Claude Code. It didn't happen in 2025, yet they still insist, while their senior engineers keep working on the production code.
What "domain expert" means is also changing however.
As I've mentioned often, I'm solving problems in a domain I had minimal background in before. However, that domain is computer vision. So I can literally "see" if the code works or not!
To expand, I've set up tests, benchmarks and tools that generate results as images. I chat with the LLM about a specific problem at hand, it presents various solutions, I pick a promising approach, it writes the code, and I run the tests, which almost always pass; if they don't, I can home in on the problem quickly with a visual check of the relevant images.
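For what it's worth, here is a minimal sketch of what that kind of loop can look like; process_image, the directory layout, and the MAX_PIXEL_DIFF tolerance are all hypothetical stand-ins for whatever the real pipeline uses:

    # Visual-regression sketch: run the pipeline on known inputs, compare
    # against stored reference images, and dump a diff image on failure so a
    # human (or an agent) can eyeball what went wrong.
    from pathlib import Path

    import numpy as np
    from PIL import Image, ImageChops

    from mypipeline import process_image  # hypothetical function under test

    MAX_PIXEL_DIFF = 3  # hypothetical per-channel tolerance (0-255)
    CASES_DIR = Path("tests/cases")
    OUT_DIR = Path("tests/failures")

    def test_against_references():
        OUT_DIR.mkdir(parents=True, exist_ok=True)
        for input_path in sorted(CASES_DIR.glob("*_input.png")):
            expected_path = Path(str(input_path).replace("_input", "_expected"))
            actual = process_image(Image.open(input_path)).convert("RGB")
            expected = Image.open(expected_path).convert("RGB")

            diff = ImageChops.difference(actual, expected)
            worst = int(np.asarray(diff).max())
            if worst > MAX_PIXEL_DIFF:
                # Save the diff as an artifact that can be inspected visually.
                diff.save(OUT_DIR / f"{input_path.stem}_diff.png")
                raise AssertionError(f"{input_path.name}: max pixel delta {worst} > {MAX_PIXEL_DIFF}")

The specifics don't matter; the point is that a failing run leaves behind an image a human (or, eventually, an agent) can glance at.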
This has allowed me to make progress despite my lack of background. Interestingly, I've now built up some domain knowledge through learning by doing and experimenting (and soon, shipping)!
These days I think an agent could execute this whole loop by itself by "looking" at the test and result images itself. I've uploaded test images to the LLM and we had technical conversations about them as if it "saw" them like a human. However, there are a ton of images and I don't want to burn the tokens at this point.
The upshot is, if you can set up a way of reliably testing and validating the LLM's output, you could still achieve things in an unfamiliar domain without prior expertise.
Taking your Postgres example, it's a heavily tested and benchmarked project. I would bet someone like Antirez would be able to jump in and do original, valid work using AI very quickly, because even if he hasn't futzed with Postgres code, he HAS futzed with a LOT of other code and hence has a deep intuition about software architecture in general.
So this is what I meant by the meaning of "domain expert" changing. The required skills have become a lot more fundamental. Maybe the only required skills are intuition about software engineering, critical thinking, and basic knowledge of statistics and the scientific method.
I'm not sure the blog post goes in the opposite direction of what you say, in fact he points out that the quality of the output depends on the quality of the hints, which implies that quality hints require quality understanding from the user.
If you are very high up the chain like Linus, I think vibe coding gives you more feedback than any average dev gets, so you get a positive feedback loop.
For most of us, vibe coding gives 0 advantage. Our software will just sit there and get no views, and producing it faster means nothing. In fact, it just scares us that some exec is gonna look at this and write us up for low performance because they saw someone do the same thing we are doing in 2 days instead of 4.
Less a 'chain' or hierarchy than a lecture hall with cliques. Many of the 'influencers', media personalities, infamous, famous, anyone with a recognizable name - for the most part - were introduced to the tsunami wave of [new tech] at the same time. They may come with advantages, but it's how they get back to the 'top' (for your chain) vs. staying up there.
For a while now I've felt a kind of apathy about it: there's more content being created than consumed.
This is true; like 90% of projects submitted on Product Hunt have 1 vote or fewer.
I've set the bar so low that getting a reply to that was already unexpected.
There is a lot of "attention" to go around for small group interactions like this subthread. Like a bar chat I guess.
Lmao, me too, the internet has become a single player game at this point. I usually just type and forget.
Except that Linus does basically zero programming these days. He's a manager, combining code from the subsystem managers below him into a final release.
That's wrong, he is coding, well, vibecoding.
https://github.com/torvalds/AudioNoise
Right, but Linus also has an extremely refined mental model of the project he maintains, and has built up a lot of skills reading code.
Most engineers in my experience are much less skillful at reading code than writing code. What I’ve seen so far with use of LLM tools is a bunch of minimally edited LLM produced content that was not properly critiqued.
Here's some of the code antirez described in the OP, if you want to see what expert usage of Claude Code looks like: https://github.com/antirez/linenoise/commit/c12b66d25508bd70... and https://github.com/antirez/linenoise/commit/a7b86c17444227aa...
This looks more worrying than impressive. It's long files of code with if-statements and flag-checking unicode bit patterns, with an enormous number of potential test-cases.
It's not conceptually challenging to understand, but time consuming to write, test, and trust. Having an LLM write these types of things can save time, but please don't trust it blindly.
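To make the "test before you trust" point concrete, here's a rough sketch (Python rather than the actual C, with char_width as a hypothetical binding to whatever the generated code exposes) of cross-checking that kind of width/flag logic against an independent reference instead of a handful of hand-picked cases:

    # Cross-check a hypothetical char_width() against Python's unicodedata
    # tables. This doesn't prove the code correct, but it sweeps far more of
    # the input space than spot checks.
    import unicodedata

    from mylib import char_width  # hypothetical binding to the generated code

    def expected_width(ch: str) -> int:
        if unicodedata.combining(ch):
            return 0  # combining marks occupy no cell
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            return 2  # wide / fullwidth characters take two cells
        return 1

    def test_width_against_unicodedata():
        mismatches = []
        for cp in range(0x20, 0x3000):  # sample a chunk of the BMP
            ch = chr(cp)
            if unicodedata.category(ch).startswith("C"):
                continue  # skip control / unassigned codepoints
            if char_width(ch) != expected_width(ch):
                mismatches.append(hex(cp))
        assert not mismatches, f"width mismatches at: {mismatches[:20]}"

Everything above is illustrative, not the project's actual tests; the point is simply to compare generated logic against an independent reference before trusting it.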
I see that dividing the tests and the code into two different changes is pretty nice. In fact, I have been using a double-agent setup where one agent writes the tests and the other writes the code, which also solves the attention issue. The code itself looks harder to read, but that is probably more on me than on Claude.
>> ...however, they are of poor utility to non-domain-experts.
IDK, just two days ago I had a bug report/fix accepted by a project which I would have never dreamt of digging into, as what it does is way outside my knowledge base. But Claude got right on in there and found the problem after a few rounds of printf debugging, which led to an assertion we would have hit with a debug build, which led to the solution. Easy peasy, and I still have no idea how the other library does its thing at all, as Claude was using it to do this other thing.
Keep believing. To the bitter end. For such human slop codebases AI slop additions will do equally fine. Add good testing and the code might even improve over the garbage that came before.
Having the LLM generate the tests too happens a little too often for any kind of improvement. simonw posted a generated “something” here the other day; he didn't know whether it really worked or not, but he was happy that his generated, completely unchecked tests were green, and yet another root commenter here praises him.
It takes a lot of work not to be skeptical when, whenever I try it, it generates shit, especially when I want something completely new that doesn't exist anywhere, and when these people show how they work with it, it always turns out to be somewhere on the scale of terrible to bad.
I also use AI, but I don’t allow it to touch my code, because I’m disgusted by its code quality. I ask it, and sometimes it delivers, but mostly not.
Which thing was that?
(If you need help finding it try visiting https://tools.simonwillison.net/hn-comments-for-user and searching for simonw - you can then search my 1,000 most recent comments in one place.)
If my tests are green then it tells me a LOT about what the software is capable of, even if I haven't reviewed every line of the implementation.
The next step is to actually start using it for real problems. That should very quickly shake out any significant or minor issues that sneaked past the automated tests.
I've started thinking about this by comparing it to work I've done within larger companies. My team would make use of code written by other teams without reviewing everything those other teams had written. If their tests passed we would build against their stuff, and if their stuff turned out not to work we would let them know or help debug and fix it ourselves.