30x productivity gain? gtfo of here.

For most things I try to use it for, the output has so many problems that at most I get a 50% productivity gain after fixing everything.

I'm already super efficient at editing text with neovim, so for some tasks I honestly end up with a net productivity loss.

Yes, I can easily get a month of work done in a single day. A working month is only about 20 to 22 days, so that's roughly a 20x gain, which makes 30x about the current max; 50x was hyperbole, because I didn't add it up before writing that post.

I just don't believe this. It's weird; I don't know where folks are getting these extreme productivity gains from.

For example, the other day I asked a major LLM to generate a simple markdown viewer in Node.js with automatic section indentation. The basic code worked after a few additional prompts from me.
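
For context on the scale of the task, here's a minimal sketch of the indentation part (my own illustration, not the LLM's output; the file name and the "indent body text by heading depth" interpretation are assumptions on my part):

    // indent-md.js - minimal sketch: indent each line by its heading depth.
    // Assumes "section indentation" means nesting body text under #/## headings.
    const fs = require('fs');

    function indentSections(markdown) {
      let depth = 0;
      return markdown.split('\n').map(line => {
        const m = line.match(/^(#{1,6})\s/);
        if (m) {
          depth = m[1].length;              // a heading sets the nesting level
          return '  '.repeat(depth - 1) + line;
        }
        return '  '.repeat(depth) + line;   // body lines sit one level deeper
      }).join('\n');
    }

    process.stdout.write(indentSections(fs.readFileSync(process.argv[2], 'utf8')));

You'd run it as "node indent-md.js some-file.md". The point is the core ask is a couple dozen lines; it was the follow-ups where things went wrong.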

Now I wanted folding. That was also done by the LLM. But when I tried to add a few more simple features, things fell apart. There were one or two seemingly simple runtime errors that the LLM was unable to fix after almost 10 tries.

I could fix it myself by digging into the code, but then the productivity gains would start to slip away.

I'll spend maybe 10 minutes crafting a prompt that explains a new feature to be added to my app. I explain it in enough detail, with zero ambiguity, that any human [senior] developer could do it. Often the result is hundreds of lines of generated code, and well over 95% of the time the code "Claude 4" generates is exactly what I wanted.

I'm using VSCode GitHub Copilot in "Agent Mode", btw. It's able to navigate around an entire project, understand it, and work on it. You just lean back and watch it open files, edit them, and show you its thought process in real time as it does everything. It's truly like magic.

Any other way of doing development, in 2025, is like being in the Stone Age.

Your response does not address the example I gave. Sure, if what you are doing is a variation on something that's been done to death, then an LLM is faster at cutting and gluing boilerplate together across multiple files.

Anything beyond that, and LLMs require a lot of hand-holding, and frequently regress to boot.

I can't tell you how many times I've seen people write shoddy, ambiguous prompts and then blame the LLM for not being able to read their minds.

If you write a prompt with perfect specificity as to what you want done, an agent like "GitHub Copilot + Claude" can work at about the same level as a senior dev. I do it all day long. It writes complex SQL, complex algorithms, etc.

Saying it only does boilerplate well reminds me of my mother, who was brainwashed by a PBS TV show into thinking LLMs can only finish sentences they've seen before and cannot reason through things.

You're still talking past my points. Look at the example I gave. Does it seem like the problem was due to an ambiguous prompt?

Even if my prompt was ambiguous, the LLM has no excuse for producing code that does not type-check, or that crashes in an obvious way when run. The ambiguity should affect what the code tries to do, not its basic quality.

And your use of totalizing phrases like "zero ambiguity" and "perfect specificity" tells me your arguments are somewhat suspect. There's no such thing as "zero" or "perfect" when it comes to architecting and implementing code.

When it comes to zero ambiguity and perfect specificity, here's how I define it: if I gave the same exact prompt wording to a human, would there be any questions they'd need to ask me before starting the work? If they need to ask a clarifying question before starting, then I wasn't clear; otherwise I was. If you want to balk at phrases like "perfectly clear", you're just nitpicking at semantics.