> getting a 30x to 50x productivity gain

That is an absurd claim.

If you get a 30x gain then you're a 0.05x developer.

A 50x gain would literally mean you could get a year's worth of work done in about a week. Preposterous.
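For anyone who wants to check the arithmetic, here's a rough sketch. It assumes a year is 52 weeks of work (ignoring holidays and vacation) and that the multiplier applies uniformly to everything you do; the function name and the 52-week constant are just for illustration.

```python
# Rough sanity check: how long does a year of work take at a given speedup?
# Assumption: a "year of work" is 52 weeks, and the speedup applies to all of it.
WORK_WEEKS_PER_YEAR = 52


def weeks_needed(speedup: float, baseline_weeks: float = WORK_WEEKS_PER_YEAR) -> float:
    """Return how many weeks the baseline workload takes at the given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_weeks / speedup


if __name__ == "__main__":
    for s in (5, 30, 50):
        print(f"{s}x: a year of work in {weeks_needed(s):.2f} weeks")
```

So 50x really does work out to roughly one week per year of output, and 30x to a bit under two weeks.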

In my experience working with a plethora of shitty contractors, bad/dumb developers don't get much of a boost. I don't think good developers are getting a 30x boost either, but they are getting more out of the tooling than bad developers.

The bottleneck is still finding good developers, even with the current generation of AI tooling in play.

It was when I started using GitHub Copilot in "Agent Mode" that my LLM productivity gains went from like 5x to 30x. People who are just using a chatbot get like 5x gains. People who use "Agent Mode" to write up a description of a new feature that would take a human several days, and then get it done in one click by an Agent, are getting 30x or more.

The amount of pushback I got on this thread tells me most devs simply haven't started using actual Agents yet.

I’ve tried using agents. LLMs just can’t reliably accomplish the tasks that I have to do. They just get shit wrong and hallucinate a ton. If I don’t break the task down into tiny chunks then they go off the rails.

This can definitely happen, because even a great Agent's context window can become flooded. For example, I often do prompts like "Add a row of buttons at the top right named 'copy', 'cut', and 'paste'", and let the Agent do that before I have it implement each button.

The rule of thumb I've learned is to give an Agent the smallest possible task at a time, so there's zero ambiguity in the prompt and the context window stays small.

One good prompt into GitHub Copilot "Agent Mode" (running Claude 4) asking for a new feature can often result in 5 to 7 files being generated and a total of 1000 lines of code being written. Your math is wrong. That's hours of work I didn't do, and it only cost me the time it took to describe the new feature in a paragraph of text.

It's ridiculous to equate lines of code to amount of engineering work or value.

A massive amount of valuable work can result in a few lines of code. Conversely, a million lines of code can be useless or even have negative value.

It's all about the quality of your prompts (i.e. your skill at writing clear, unambiguous instructions with correct terminology).

An experienced developer can generate tons of great code 30x faster with an Agent, with each function/module still being written using the least amount of code possible.

But you're right, the measure of good code isn't 'N', it's '1/N' (inverse), where N is the number of lines of code it takes to do something. The best code is [almost] always the one with the fewest lines, as long as you haven't sacrificed readability in order to remove lines, which I see a lot of juniors do. Rule of thumb is: "Least amount of easily understood LOC". If someone can't look at your code for the first time and tell what it's doing, that's normally an indication it's not good code. Claude [almost] never breaks any of these rules.

> Claude [almost] never breaks any of these rules.

Well it does for me, frequently. An example is here: https://news.ycombinator.com/item?id=44126962

Not sure how Claude frequently fails for you, but everybody I know says it rarely fails. I'm definitely not claiming it's perfect tho.

Did you look at the link I sent?

What's your point?

What tech stack are you using? It matters a lot what tech you are using when it comes to how effective the LLMs are.

I'm using VS Code with GitHub Copilot, which has an "Agent Mode". It proactively reads through your project files to understand the project, but imo you still have to give it pretty precise instructions to get what you want.