+1

I’ve been driving Claude as my primary coding interface the last three months at my job. Other than a different domain, I feel like I could have written this exact article.

The project I’m on started as a vibe-coded prototype that quickly got promoted to a production service we sell.

I’ve had to build the mental model after the fact, while refactoring and ripping out large chunks of nonsense or dead code.

But the product wouldn’t exist without that quick and dirty prototype, and I can use Claude as a goddamned chainsaw to clean up.

On Friday, I finally added a type checker pre-commit hook and fixed the 90 existing errors (properly, no type ignores) in ~2 hours. I tried a fully agentic approach first, and it failed miserably; then I went through the errors one by one with Claude, tightened up some existing types, fixed some clunky abstractions, and got a nice, clean result.

AI-assisted coding is amazing, but IMO for production code there’s no substitute for human review and guidance.

My process: start ideating and get the AI to poke holes in your reasoning, your vision, scalability, etc. Do this for a few days while taking breaks. This is all contained in one Markdown file with Mermaid diagrams and sections.

Then use the ideation to architect: dive into details and tell the AI exactly what your choices are, how certain methods should be called, how logging and observability should be set up, what language to use, type checking, coding style (configure ruthless linting and formatting before you write a single line of code), what testing methodology and framework (unit, integration, e2e), and how database changes and migrations will be handled. Pin down as much as possible so the AI is as confined as possible to how you would do it.

Then, create a plan file and have the AI manage it like a task list, implementing in parts. Before starting, it needs to present you a plan; in it you will notice it makes mistakes, misunderstands things that you maybe didn't clarify before, or just forgets. You add to AGENTS.md or whatever, make changes to the AI's plan, tell it to update the plan.md, and when satisfied, proceed.

After it's done, review the code. You will notice there is always something to fix: hardcoded variables, a SQL migration with seed data that should not actually be a migration, just generally crazy stuff.

The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable, records have little to no validation, and when you report an error during testing it tries to solve it with a brittle async solution, like LISTEN/NOTIFY or a callback, instead of the architecturally correct solution. Things that at scale are hell to debug, especially if you did not write the code.
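To make the "loose on requirements" point concrete, here's a hedged TypeScript sketch (the User record and parseUser helper are hypothetical, not from any project mentioned here): the first shape is the kind of thing agents tend to produce, the second is what review usually tightens it into.

```typescript
// What an agent often drafts: every field optional and nullable, no validation.
interface LooseUser {
  id?: string | null;
  email?: string | null;
  createdAt?: string | null;
}

// What review tends to tighten it into: required fields plus a runtime guard
// at the boundary, so bad records fail fast instead of deep in the system.
interface User {
  id: string;
  email: string;
  createdAt: Date;
}

function parseUser(raw: unknown): User {
  const r = raw as Partial<Record<keyof User, unknown>>;
  if (typeof r.id !== "string" || r.id.length === 0) {
    throw new Error("invalid user: id");
  }
  if (typeof r.email !== "string" || !r.email.includes("@")) {
    throw new Error("invalid user: email");
  }
  if (typeof r.createdAt !== "string" || Number.isNaN(Date.parse(r.createdAt))) {
    throw new Error("invalid user: createdAt");
  }
  return { id: r.id, email: r.email, createdAt: new Date(r.createdAt) };
}
```

The exact validation library (zod, a hand-rolled guard like this, etc.) matters less than the fact that someone made the fields non-nullable on purpose.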

If you do this and iterate you will gradually end up with a solid harness and you will need to review less.

Then port it to other projects.

> After it's done, review the code. You will notice there is always something to fix: hardcoded variables, a SQL migration with seed data that should not actually be a migration, just generally crazy stuff.
>
> The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable, records have little to no validation, and when you report an error during testing it tries to solve it with a brittle async solution, like LISTEN/NOTIFY or a callback, instead of the architecturally correct solution. Things that at scale are hell to debug, especially if you did not write the code.

For that I usually get it reviewed by LLMs first, before reviewing it myself.

Same model but a clean session, and different models from different providers. And multiple (at least 2) automated rounds of: review -> triage by the implementing session -> addressing the feedback, with reasons given for anything deferred or ignored -> review -> triage by the implementing session -> …

Works wonders.

Committing the initial spec / plan also helps the reviewers compare the actual implementation to what was planned. Didn’t expect it, but it’s worked nicely.

LISTEN/NOTIFY is not brittle, we use it for millions of events per day.

I agree! It should be very stable, IMO. If not, then please send a bug report and we'll look into it. Also, now it scales well with the number of listening connections (given clients listen on unique channel names): https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

The LISTEN/NOTIFY feature really just doesn’t get enough PR. It is perfectly suitable for production workloads yet people still want to reach for more complicated solutions they don’t need.

It's not the feature itself, it's how and what the LLM tries to use it for. It uses it to cross any and all architectural boundaries.


I find it very interesting that you assume this method would branch out to other projects. I find it even more interesting that you assume all software codebases use a database, give a damn about async anything, and that these ideas percolate out to general software engineering.

Sounds like a solid way to make crud web apps though.

GP is clearly providing examples of categories of tasks. Sure, not all languages do “async fn foo()”, but almost all problem domains involve some sort of making sure the right things happen at the right times, which is in a similar ballpark.

Holier than thou “yeah well I work on stuff that doesn’t use databases, checkmate!” doesn’t really land - data still gets moved around somehow, and often over a network!

Not trying to "land" anything.

I’ve found that LLMs will frequently do extremely silly things that no person would do to make typescript code pass the typechecker.

I've noticed this too, though not necessarily with type checkers, more with linters. And I can't really figure out if there's even a way to solve it.

If you set up restrictive linters and don't explicitly prohibit agents from adding inline allows, most of the added lines will be allow comments.
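For what it's worth, ESLint can enforce that prohibition mechanically rather than via prompt instructions. A minimal sketch assuming ESLint 9's flat config (linterOptions.noInlineConfig and reportUnusedDisableDirectives are real options; the two rules listed are just placeholders for whatever restrictive set you actually run):

```typescript
// eslint.config.ts (sketch): make inline "allow" comments useless.
export default [
  {
    linterOptions: {
      // Ignore /* eslint-disable */ and // eslint-disable-next-line entirely,
      // so an agent cannot silence a rule from inside the code.
      noInlineConfig: true,
      // If a disable directive is present but has no effect, fail the run.
      reportUnusedDisableDirectives: "error",
    },
    rules: {
      // Placeholder strict rules; substitute your own ruthless set.
      "no-unused-vars": "error",
      "eqeqeq": "error",
    },
  },
];
```

With inline config disabled at the linter level, the agent's only way to a green run is to actually fix the code.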

Based on this learning, I've decided to prohibit any inline allows. And then agents started doing very questionable things to satisfy Clippy.

Recent example:

- Claude set up a test support module so that it could reuse things. Since this was not used in all tests, Rust complained about dead_code. Instead of making it work, Claude decided to remove the test support module and just... blow up each test.

If you enable thinking summaries, you'll always see agent saying something like: "I need to be pragmatic", which is the right choice 50% of the time.

You need to be very specific and also question the output if it does something insane.

This decade’s version of “works on my box”

I've found it's less about specificity and more about reducing the number of critical assumptions it needs to make. Being too specific can be a hindrance in its own right.

And that's also a decent barometer for what it's good at: the more critical assumptions the AI needs to make, the less likely it is to make good ones.

For instance, when building a heat map, I don't have to get specific at all, because the number of consequential assumptions it needs to make is slim. I don't care about, or can easily change, the colors or the label placement.

I caught it using Parameters<typeof otherfn>[2] the other day. It wanted to avoid importing a type, so it did this nonsense. (I might have the syntax slightly wrong here, I'm writing from memory.)

But it's not all bad news. TIL about Parameters<T>.
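For anyone else meeting it here: Parameters&lt;T&gt; is a built-in TypeScript utility type that extracts a function's parameter types as a tuple. A small sketch with a hypothetical sendEvent standing in for the otherfn above, showing both the utility itself and why indexing into it as an import-dodge is fragile:

```typescript
// Hypothetical function standing in for "otherfn" from the comment above.
function sendEvent(name: string, payload: Record<string, unknown>, retries: number): string {
  return `${name}:${retries}`;
}

// The trick the model reached for: grab the third parameter's type by position
// instead of importing the type. It compiles, but it silently changes meaning
// if anyone reorders or removes a parameter.
type RetriesViaIndex = Parameters<typeof sendEvent>[2]; // resolves to: number

// The boring alternative: name and export the type, then import it where needed.
type Retries = number;

const retries: RetriesViaIndex = 3;
console.log(sendEvent("deploy", {}, retries)); // prints "deploy:3"
```

Used on your own code, positional indexing couples the caller to parameter order; Parameters&lt;T&gt; is far more defensible when you genuinely can't name a third-party function's parameter types.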

Yeah, I've found LLMs cannot write good Typescript code period. The good news is that they are excellent at some other languages.

I can't agree here. https://pelorus-nav.com/ (one of my side projects) is 95-98% written by Claude Opus 4.6, all in very nice typescript which I carefully review and correct, and use good prompting and context hygiene to ensure it doesn't take shortcuts. It's taken a month or so but so worth it. And my packing list app packzen.org is also pretty decent typescript all through.

> which I carefully review and correct

So you do agree? If you are having to review and correct then it's not really the LLM writing it anymore. I have little doubt that you can write good Typescript, but that's not what I said. I said LLMs cannot write good Typescript and it seems you agree given your purported actions towards it. Which is quite unlike some other languages where LLMs write good code all the time — no hand holding necessary.

I think it can write working TypeScript code, and it can write good TypeScript code if it is guided by a knowledgeable programmer. It requires actually reviewing all the code and giving pointed feedback though (which at that point is only slightly more efficient than just writing it yourself).

> It requires actually reviewing all the code and giving pointed feedback though

Exactly. You can write good Typescript, no doubt, but LLMs cannot. This is not like some other languages where LLM generated code is actually consistently good without needing to become the author.

Fwiw, the article mirrors my experience when I started out too, even exactly with the same first month of vibecoding, then the next project which I did exactly like he outlined too.

Personally, I think it's just the natural flow when you're starting out. If he keeps going, his opinion is going to change and as he gets to know it better, he'll likely go more and more towards vibecoding again.

It's hard to say why, but you get better at it, even if it's really hard to put into words why.

Given how addictive vibecoding is, I think it's very hard to be objective about the results if you are involved in the process.

It's a little like asking a cokehead how the addiction is going for him while he is high. Obviously he's going to say it's great because the consequences haven't hit him. Some percentage of addicts will never realize it was a problem at all.

It's not random that AI happens to be built by the very same people who turned internet forums into the most addictive communication technology ever.

> he'll likely go more and more towards vibecoding again

I think "more and more" is doing some very heavy lifting here. On the surface it reads like "a lot" to many people, I think, which is why this is hard to read without cringing a bit. Read like that it comes off as "It's very addictive and eventually you get lulled into accepting nonsense again, except I haven't realized that's what's happening".

But the truth is that this comment really relies entirely on what "more and more" means here.

You can’t put it into words? Why? Perhaps you haven’t looked at it objectively?

It may actually be true. Your feeling might be right - but I strongly caution you against trusting that feeling until you can explain it. Something you can’t explain is something you don’t understand.

really?

have you ever learned a skill? Like carving, singing, playing guitar, playing a video game, anything?

It's easy to get better at it without understanding why you're better at it. As a matter of fact, very very few people master the discipline enough to be able to grasp the reason for why they're actually better

Most people just come up with random shit which may or may not be related. Which I just abstained from.

I've learned a number of skills, and for me none of them worked in the way you're describing. I didn't learn to cut good miter joints by randomly vibe-sawing wood until I unlocked miter joints in the skill tree. I carefully studied the errors I made, and adjusted in ways I thought might correct them, some of which helped some of which did not. Then eventually I understood the relationship between my actions and the underlying principles in enough detail to consistently hit 45 degrees.

Isn't that example pretty reductive, in that you have a directly-measurable output? I mean, the joint is either 45° (well, 90°) or it's not. Zoom out a bit, and the skill-set becomes much less definable: are my cabinets good - for some intersection of well-proportioned, elegantly-finished, and fit for purpose, with well-chosen wood and appropriate hardware.

Mind you, I don't think the process of improvement in those dimensions is fundamentally different, just much less direct and not easily (or perhaps even at all) articulable.

You can get better at something without understanding why, but you should be able to think about it and determine why fairly easily.

This is something everyone who cares about improving in a skill does regularly - examine their improvement, the reasons behind it, and how to add to them. That’s the basis of self-driven learning.

This is an absurd statement. There are many complex undertakings in sport where even the very best get better with practice and can't tell you why. In fact, the ones who think they can tell you why are the ones to be most skeptical of.

You are just making stuff up or regurgitating material from a pop science book.

Instead of accusing others of making things up, perhaps step back and re-evaluate the conversation you're taking part in. In this instance, it appears that you misunderstood or skipped over the word "learning".

They can't tell you (not everyone is eloquent), but they sure know why. Struggling to put something into words is not the same as not knowing.

Much of human behavior is evolved so that we don't understand why. For example human morality is an evolved trait, but you wouldn't know it.

Please explain walking to me so that I can explain it to a person who forgot how to walk such that he can walk after the explanation.

Nope, they don't.

Not really. I can obviously say something, like: you learn which features the models are able to actually implement, and you learn how to phrase and approach trickier features to get the model to do what you want.

And that's not really explainable without exploring specific examples. And now we're in thousands of words of explanation territory, hence my decision to say it's hard to put it into words.

I think you’re handwaving away vague, ungrounded intuition and calling it learning.

For instance, if I say “I noticed I run better in my blue shoes than my red shoes” I did not learn anything. If I examine my shoes and notice that my blue shoes have a cushioned sole, while my red shoes are flat, I can combine that with thinking about how I run and learn that cushioned soles cause less fatigue to the muscles in my feet and ankles.

The reason the difference matters is that if I don't do the learning step, then when I buy another pair of blue shoes and they're flat soled, I'm back to square one.

Back to the real scenario, if you hold on to your ungrounded intuition re what tricks and phrasing work without understanding why, you may find those don’t work at all on a new model version or when forced to change to a different product due to price, insolvency, etc.

You're always free to stop at the level of abstraction at which you find a certain answer to be satisfying, but you can also keep digging. Why are flat shoes better? Well, it's to do with my gait. Ok, but why is my gait like that? Something-something musculoskeletal. Why is my body that way? Something-something genetic. OK, but why is that? And so on.

Pursued far enough, any line of thought will reach something non-deterministic - or, simply, That's The Way It Is - however unsatisfying that is to those of us who crave straightforward answers. Like it or not, our ground truth as human beings ultimately rests on intuition. (Feel free to say, "No, it's physics", or "No, it's maths", but I'll ask you if you're doing those calculations in your head as you run!)

It is very silly to treat zero grounding the same as accepting core, proven concepts. Your PoV here is no different than saying "It rains because god is sad and crying" is an appropriate thing to believe.

If you want to say "god is responsible for creating the precipitation cycle", sure. But we don't disregard understanding that exists to substitute intuition.
