He’s not wrong.
Getting 80% of the benefit of LLMs is trivial. You can ask it for some functions or to write a suite of unit tests and you’re done.
The last 20%, while possible to attain, is ultimately not worth it for the amount of time you spend in context hells. You can just do it yourself faster.
> The last 20%, while possible to attain, is ultimately not worth it for the amount of time you spend in context hells. You can just do it yourself faster.
I'm arguing that there's a skill that has to be learned in order to break through this. As you start in a new code base, you should be quick to jump in when you hit that 20%. But, as you spend more time in it, you learn how to avoid the same "context hell" issues and move that number down to 15%, 10%, 5% of the time.
You're still going to need to jump in, but when you can learn to get the LLM to write 95% of the code for you, that's incredibly powerful.
It’s not incredibly powerful, it’s incrementally powerful. Getting the first 80% via LLM is already the incredible power. A sufficiently skilled developer should be able to handle the rest with ease. It is not worth doing anything unnatural in an effort to chase down the last 20%, you are just wasting time and atrophying skills. If you can get the full 95% in some one-shot prompts, great. But don’t go chasing waterfalls.
No, it actually has an exponential growth type of effect on productivity to be able to push it to the boundary more.
I’m making this a bit contrived, but I’m simplifying it to demonstrate the underlying point.
When an LLM is 80% effective, I’m limited to doing 5 things in parallel since I still need to jump in 20% of the time.
When an LLM is 90% effective, I can do 10 things at once. When it’s 95%, 20 things. 99%, 100 things.
Now, obviously I can’t actually juggle 10 or 20 things at once. However, the point is there are actually massive productivity gains to be had when you can reduce your involvement in a task from 20% down to even 10%. You’re effectively 2x as productive.
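The arithmetic behind this boils down to: if you must be hands-on for a fraction f of each task's duration, you can interleave at most roughly 1/f tasks. A minimal sketch of that simplified model (the 1/f relationship is the commenter's contrived assumption, not a measured result):

```python
def max_parallel_tasks(involvement: float) -> int:
    """Rough upper bound on tasks you can juggle, given the
    fraction of each task's duration that needs you hands-on."""
    if not 0 < involvement <= 1:
        raise ValueError("involvement must be in (0, 1]")
    return round(1 / involvement)

# 20% involvement -> 5 tasks, 10% -> 10, 5% -> 20, 1% -> 100
for f in (0.20, 0.10, 0.05, 0.01):
    print(f"{f:.0%} involvement -> ~{max_parallel_tasks(f)} tasks")
```

Halving your involvement doubles the ceiling, which is why the model looks linear per step but compounds quickly as involvement approaches zero.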
I’d bet you don’t even have 2 or 3 things to do at once, much less 100. So it’s pointless to chase that level of coverage.
Do you understand what parallel means? Most LLMs respond in seconds, there is no parallel work for you to do there.
Or do you mean you are using long running agents to do tasks and then review those? I haven't seen such a workflow be productive so far.
I run through a really extensive planning step that generates technical architecture and iterative tasks. I then send an LLM along to implement each step, debugging, iterating, and verifying its work. It's not uncommon for it to take a non-trivial amount of time to complete a step (5+ minutes).
Right now, I still need to intervene enough that I'm not actually doing a second coding project in parallel. I tend to focus on communication, documentation, and other artifacts that support the code I'm writing.
However, I am very close to hitting that point and occasionally do on easier tasks. There's a _very_ real tipping point in productivity when you have confidence that an LLM can accomplish a certain task without your intervention. You can start to do things legitimately in parallel when you're only really reviewing outputs and doing minor tweaks.
> I'm arguing that there's a skill that has to be learned in order to break through this. As you start in a new code base, you should be quick to jump in when you hit that 20%. But, as you spend more time in it, you learn how to avoid the same "context hell" issues and move that number down to 15%, 10%, 5% of the time.
The problem is that you're learning a skill that will need refinement each time you switch to a new model. You will redo some of this learning on each new model you use.
This actually might not be a problem anyway, as all the models seem to be converging asymptotically towards "programming".
The better they do on the programming benchmarks, the further away from AGI they get.
Exactly. People delude themselves into thinking this is productivity. Tweaking prompts to get it "right" is very wasteful.