I'm a huge fan of Claude Code. That being said, it blows my mind that people can use this at a higher level than I do. I really need to approve every single edit and keep an eye on it at ALL TIMES, otherwise it goes haywire very, very fast!
How are people using auto-edits and these kinds of higher-level abstractions?
The secret to being an elite 10x dev: push thousands of lines of code, soak up the oohs and ahhs at the standup when management highlights your amazingly large line count, post to LinkedIn about how great and humble you are, then move on to the next role before anyone notices you contributed nothing but garbage and some loser 0.1x dev has to spend months fixing something they could have written from scratch in a week or two.
This has been my experience with coworkers who are big vibe coders as well. Another “sorry, big PR coming in that needs a review” and I’m gonna lose it. 50 comments later and they still don’t change.
When using agents like this, you only see a speedup because you're offloading the time you'd spend thinking about / understanding the code. If you can review code faster than you can write it, you're cutting corners on your code reviews. That's normally fine with humans (this is why we pay them), but not with AI. Most people just code review for nitpicks anyway (rename a variable, add some whitespace, use map/reduce instead of forEach) instead of taking the time to understand the change (which means looking at lots of code and docs that aren't present in the diff).
That is, unless you type really slowly - which I've recently discovered is actually a bottleneck for some professionals (slow typing, syntax issues, constantly checking docs, etc.). I'll add that I experience this too when learning a new language, and AI is immensely helpful there.
You're absolutely right, but I wonder if we'll have to ditch the traditional code review for something else, perhaps automated, if this agentic approach continues.
> You're absolutely right
Claude! Get off HN and get back to work.
Oh my, that was unintentional. What have I become...
AI can actually review PRs decently when given enough context and detailed instructions. It doesn't eliminate the PR problem, but it can catch a lot of bugs, and it can flag parts of the code that look questionable for humans to manually verify.
Fortunately, you can also force the agent to write up a summary of the code change, its reasoning, etc., which can help with review.
Which industry are you in, that there is a 1:1 ratio of coding to review hours?
SaaS company. And there absolutely isn't - which is why we pay for devs. Mostly due to trust, we're able to review human PRs faster than equivalent AI PRs. Trust is worth a lot of money.
This. I'm always amazed at how LLMs are praised for being able to churn out the large amounts of code we apparently all need.
I keep wondering why. Every project I've ever seen needed lines of code, nuts and bolts removed rather than added. My best libraries consist of a couple of thousand lines.
LLMs are a godsend when it comes to developing things that fit into one of the tens of thousands (or however many) of templates they have memorized. For instance, a lot of modern B2B software development involves updating CRUD interfaces and APIs to data. If you already have 50 or so CRUD functions implemented in an existing layered architecture, asking an LLM to implement the 51st, given a spec, is a _huge_ time-saver. Of course, you still need to use your human brain to verify beforehand that there aren't special edge cases that need to be considered. Sometimes you can explain the edge cases to the LLM and it will do a perfect job of figuring them out (assuming you explain them well and they're not too complicated). And if there aren't any real edge cases to worry about, the LLM can one-shot a perfect PR (assuming you did the work to give it the context).
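To make that concrete, here's a minimal sketch of the kind of templated endpoint an LLM can one-shot from a spec - the names, the service layer, and the Express/TypeScript stack are all hypothetical, just stand-ins for whatever your existing 50 handlers look like:

    // Hypothetical "51st" CRUD handler, following the same layered pattern
    // as the existing ones (Express + TypeScript assumed for illustration).
    import { Router, Request, Response } from "express";
    import { widgetService } from "./services/widgetService"; // hypothetical service layer

    export const widgetRouter = Router();

    // GET /widgets/:id - fetch a single widget, 404 if missing
    widgetRouter.get("/widgets/:id", async (req: Request, res: Response) => {
      const widget = await widgetService.findById(req.params.id);
      if (!widget) {
        return res.status(404).json({ error: "widget not found" });
      }
      return res.json(widget);
    });

The point isn't the code itself - it's that the shape is already fixed by the surrounding codebase, so the model has very little room to go wrong.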
Of course, there are many, many other kinds of development - when building novel low-level systems for complicated requirements, you're going to get much poorer results from an LLM, because the project won't fit as neatly into one of the "templates" it has memorized, and the LLM's reasoning capabilities are not yet sophisticated enough to handle arbitrary novelty.
The elephant in the room, though, is that the vast majority of programming fits into the template style that LLMs are good at. That's why so many people are afraid of it.
Yes - and I think those people need to expand their skillsets to include the things that the LLMs _cannot_ (yet) do, and/or expand their productivity by wielding the LLMs to do their work for them in a very efficient manner.
I think Bill Gates's quote was something like "Measuring a software project's progress by increase in lines of code is like measuring an airplane project's progress by increase in weight."
There are a lot of smart people on HN who are good coders and would do well to stop listening to this BS.
Great engineers who pick up vibe coding without adopting the ridiculous "it's AI so it can't be better than me" attitude are the ones who turn into incredibly proficient people, able to move mountains in very little time.
People stuck in the "AI can only produce garbage" mindset are unknowingly saying something about themselves. AI is mainly a reflection of how you use it. It's a tool, and learning how to use that tool proficiently is part of your job.
Of course, some people have the mistaken belief that by taking the worst examples of bullshit-coding and painting all vibe coders with that same brush, they'll delay the day they lose their job a tiny bit more. I've seen many of those takes by now. They're all blind, and they get upvoted by people who either haven't had the experience (or the correct setup) yet, or who are in pure denial.
The secret? The secret is that just as before you had a large amount of "bad coders", now you also have a large amount of "bad vibe coders". I don't think it's news to anyone that most people tend to be bad or mediocre at their job. And there's this mistaken thinking that the AI is the one doing the work, so the user cannot be blamed… but yes they absolutely can. The prompting & the tooling set up around the use of that tool, knowing when to use it, the active review cycle, etc - all of it is also part of the work, and if you don't know how to do it, tough.
I think one of the best skills you can have today is to be really good at "glance-reviews" in order to be able to actively review code as it's being written by AI, and be able to interrupt it when it goes sideways. This is stuff non-technical people and juniors (and even mediors) cannot do. Readers who have been in tech for 10+ years and have the capacity to do that would do better to use it than to stuff their head in the sand pretending only bad code can come out of Claude or something.
You can't, at least for production code. I have used Claude Code to vibe code several side projects now, some just for fun, others more serious that need to be well written and maintainable. For the former, as long as it works, I don't care, though I could easily see issues like dependency management. For the latter, because I actually need to personally verify every detail of the final product and review (which means at least "scan") the code, I always see a lot of issues: tightly coupled code that makes testing difficult, missing test cases, using regex where it shouldn't, giant classes that are impossible to read and maintain. Many of the issues you see humans make, in other words. I needed to constantly interrupt it and ask it to do something different.
> You can't, at least for production code.
You can. People do. It's not perfect at it yet, but there are success stories of this.
Are you talking about the same thing as the OP?
I mean, the parent even pointed out that it works for vibe coding and stuff you don't care about; ...but the 'You can't' refers to this question by the OP:
> I really need to approve every single edit and keep an eye on it at ALL TIMES, otherwise it goes haywire very, very fast! How are people using auto-edits and these kinds of higher-level abstractions?
No one I've spoken to is just sitting back writing tickets while agents do all the work. If it was that easy to be that successful, everyone would be doing it. Everyone would be talking about it.
To be absolutely clear, I'm not saying that you can't use agents to modify existing code. You can. I do; lots of people do. ...but that's using it like you see in all the demos and videos: at a code level, in an editor, while editing and working on the code yourself.
I'm specifically addressing the OPs question:
Can you use unsupervised agents, where you don't interact at a 'code' level, only at a higher level of abstraction?
...and, I don't think you can. I don't believe anyone is doing this. I don't believe I've seen any real stories of people doing this successfully.
> Can you use unsupervised agents, where you don't interact at a 'code' level, only at a higher level of abstraction?
My view, after having gone all-in with Claude Code (almost only Opus) for the last four weeks, is "no". You really can't. The review process needs to be diligent and all-encompassing and is, quite frankly, exhausting.
One improvement I have made to my process for this is to spin up a new Claude Code instance (or clear context) and ask for a code review based on the diff of all changes. My prompt for this is carefully structured. Some issues it identifies can be fixed with the agent, but others need my involvement. It doesn’t eliminate the need to review everything, but it does help focus some of my efforts.
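For the curious, a minimal sketch of what that second-pass review setup can look like - the prompt wording and file name are my own and purely illustrative, assuming a git repo and Node/TypeScript for the glue:

    // Sketch: collect the diff of all changes and build a structured review
    // prompt for a fresh agent session (prompt text is illustrative only).
    import { execSync } from "node:child_process";
    import { writeFileSync } from "node:fs";

    const diff = execSync("git diff main...HEAD", { encoding: "utf8" });

    const prompt = [
      "You are reviewing a change you did not write.",
      "Focus on correctness, missing tests, and tight coupling.",
      "Separate issues you could fix yourself from ones needing a human decision.",
      "",
      diff,
    ].join("\n");

    // Paste or pipe review-prompt.md into a fresh Claude Code session.
    writeFileSync("review-prompt.md", prompt);

The key part is that the reviewing session starts with no memory of writing the change, so it doesn't just rubber-stamp its own reasoning.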
Do you know of any links to writeups (or just mentions) of this?
Check out the_mitsuhiko's YouTube channel; he has been showing some good techniques over the past few weeks.
I don't trust Armin for that, he's too good a developer for vibe coding. The question is whether someone who can't program at all can make something that works well with LLMs, not whether Armin can.
Is that the question? I definitely don't think that's remotely reasonable for someone who can't program. For small things, yes, but large things? They're going to get into a spin cycle with the LLM on some edge case it's confused about, where they keep saying "the button is blue!" and the bot keeps confirming it is indeed not blue.
It really depends on the area though. Some areas are simple for LLMs, others are quite difficult even if objectively simple.
Granted, at the moment I'm not a big believer in vibe coding in general, but IMO it requires quite a bit of knowledge to be hands-off and not have it fall into wells of confusion.
That's what I understood from the top-level question, and it's my experience as well. If you don't review the LLM's code, it breaks very quickly. That's why the question for me isn't "how many agents can I run in parallel?", but "how many changes can I review in parallel?".
For me, that's "just one", and that's why LLM coding doesn't scale very far for me with these tools.
If you have to understand the code, it's not vibe coding. Karpathy's whole tweet was about ignoring the code.
If you have to understand the code to progress, it's regular fucking programming.
I don't get gushy about code generation when I use yasnippet or a vim macro, so why should super-autocomplete be different?
This is an important distinction, because if Karpathy's version becomes real we're all out of a job, and I'm sick of hearing developers role-play publicly towards leaders that their skills aren't valuable anymore.
I disagree; I think there are degrees of governance that these concepts cover. It's all subjective, of course. For me, though, I've "vibe coded" projects (as testing grounds) with minimal review, but still used my programming experience to shape the general architecture and testing practices into what I thought would best fit the LLM.
The question is how much you review, and how much your experience helps it. Even if you didn't know code, you're still going to review the app - ideally incrementally, or else you won't know what's working and what isn't. Reviewing the technical "decisions" from the LLM is just an incremental step towards reviewing every LOC. There's a large gulf between full reviews and no reviews.
Where in that gulf you decide to call it "vibe coding" is up to you. If you only consider it vibing when you never look at the code, though, then most people don't vibe code, IMO.
I think of "vibe coding" as synonymous with "sloppy/lazy coding", e.g. you're skipping details and "trusting" that the LLM is either correct or has enough guardrails to be correct in the implementation. How many details you skip is variable, though, IMO.
> The question is whether someone who can't program at all can make something that works well with LLMs
Is that where the goalposts are now?
No, this is where this discussion is, since the top comment. Please go elsewhere for straw men.
No one mentioned anything about "people who can't program at all" until your comment. Up until then the discussion was about using LLMs for production ready code. It's a given that people working on production systems know how to program.
That is what "auto-edits" (in the first comment) and "vibe coding" (in the second comment) mean.
There are few writeups, but if you go to agentic coding meetups you can find people who show the stuff they build. It's really quite impressive.
Do they go into the architecture and code structure, or more just the user-facing result? Coding agents do a lot of copy-paste or near equivalents and make way too much code to accomplish many things.
Ah, we don't have any such meetups where I am... Are these from people who can't program at all?
Also, yes. I wrote about this a bit here: https://lucumr.pocoo.org/2025/7/20/the-next-generation/
There are a lot of people who are entering programming via this thing.
Sure, but when I tried to vibe code something in a language I didn't have experience with, and so didn't look at the code at all, I had to basically trash the codebase after a few hundred lines because nothing I'd say could make the LLM fix the problems, or if it fixed them, more problems would pop up elsewhere.
In my experience, if you can't review the code and point out the LLM's mistakes to it, the codebase gets brittle fast. Maybe other people are better vibe coders than me, but I never managed to solve that problem, not even with Opus 4.1.
Deep down you know the answer already :)
There is no magic way. It boils down to less strict inspection.
I try to maintain an overall direction and care less about individual lines of code.
Same, I manually approve and steer each operation. I don't see how cleaning up and simplifying after the fact is easier or faster.
That kills iteration speed. Carefully outline, get it to write tests first, then let it go wild and verify the tests while it's doing that. If there are test issues, tell it so without interrupting it, and it'll just queue up those fixes without having to stop and be re-routed.
You want to periodically have coverage improvement -> refactor loops after implementing a few features. You can figure out the refactors you want while the agent is implementing the code, after you've sussed out any test issues, then just queue up instructions on how to refactor once the tests are passing.
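As a rough illustration of the tests-first step - the function and the vitest setup here are hypothetical, just the kind of target you'd pin down before letting the agent loose:

    // Hypothetical tests written up front; the agent's job is to make them pass.
    import { describe, it, expect } from "vitest";
    import { parseDuration } from "../src/parseDuration"; // not implemented yet

    describe("parseDuration", () => {
      it("parses simple minute strings into milliseconds", () => {
        expect(parseDuration("5m")).toBe(5 * 60 * 1000);
      });

      it("rejects malformed input instead of guessing", () => {
        expect(() => parseDuration("five minutes")).toThrow();
      });
    });

With the expected behaviour nailed down like this, you can watch the test run instead of every individual edit.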
Oh boy, they can't. Those are inexperienced people vibe coding their idea of how product management could be done, lacking the experience to realize it doesn't work that way. How many of these Claude Code wrappers have been posted here in the last few weeks? Must be well into the double digits.
I think a lot of the AI stuff suffers from being better suited to showing off than to actually working. How often have I thought, or worse, told my friends, that it one-shotted some issue of mine, only to realize later that it was only partially working. The devil's in the details.
I could not even (as of yesterday) get some boilerplate code out of AI. It very confidently spat out code that would not even compile (multiple times). Yes, it is better than parsing Stack Overflow pages when I have some specific task or error, and sometimes slightly better than reading a bunch of docs for a library, but even then I have to verify that it is giving me code/examples from the latest versions.
You can just tell it to read the library code in node_modules or wherever you have vendored libs in your framework. I, for example, give it whole demo examples and just say "look at @demos/ for how to do this". Cursor / CC authors don't add these prompts, as this would be costly for them (and they likely run at a loss now).
Through multi-pass development. It's a bit like how processes happen inside a biological cell: there is no structure there; structure emerges out of chaos. The same goes for AI coding tools, especially Claude Code. We are letting code evolve to pass our quality gates. I do get to sit on my hands a lot, though, which frees up my time.
Yeah, I agree. I never let the AI make any architectural decisions (and I also watch Claude Code like a hawk lol). That being said, since we started using this system, we noticed that our PRDs and implementation plans (epics) became more detailed, giving the AI a lot less wiggle room.
Essentially, I'm treating Claude Code as a very fast junior developer who needs to be spoon-fed with the architecture.
I've seen that happen, but usually with codebases that are either not very well documented (reference docs) or that have a lot of abstractions and are complicated.
You can't make 90% of your codebase AI-generated if you reuse code all the time; just don't abstract.