You're only as fast as your biggest bottleneck. Adding AI to an existing organization is just going to show you where your bottlenecks are; it's not going to magically make them go away. For most companies, the speed of writing code probably wasn't the bottleneck in the first place.
the amount of people that work in technology and have never heard of amdahl's law always shocks me
https://en.wikipedia.org/wiki/Amdahl's_law
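For anyone who hasn't run the numbers, a rough sketch in TypeScript (the 10% coding share and the 2x speedup are made-up figures, purely for illustration):

    // Amdahl's law: overall speedup when only a fraction p of the work
    // speeds up by a factor s.
    function amdahlSpeedup(p: number, s: number): number {
      return 1 / ((1 - p) + p / s);
    }

    // If coding is 10% of the job and AI doubles coding speed:
    console.log(amdahlSpeedup(0.1, 2).toFixed(3)); // ~1.053, i.e. about a 5% overall gain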
a 100% increase in coding speed means I then get to spend an extra 30 minutes a week in meetings
while now hating my job, because the only fun bit has been removed
"progress"
So if I'm understanding you correctly, prior to AI tools you spent 1 hour per week coding? And now you spend 30 minutes per week?
the number of people who have heard of Amdahl's law but don't know when to use "amount of X" vs "number of Y" always shocks me as well
Agreed. The bottleneck is QA/code review, and that is never going away at most corps. I've never worked at a job in tech that didn't require code review, and no, asking a code agent to review a PR is never going to be "good enough".
And here we are: the central argument for why code agents are not the job-killing hype beasts they're so regularly claimed to be.
Has anyone seen what multi-agent code workflows produce? Take a look at openclaw, the code base is an absolute disaster. 500k LoC for something that can be accomplished in 10k.
My head of engineering spent half a day creating a complex setup of agents in opencode, to refactor a data model across multiple repositories. After a day running agents and switching between providers to work around the token limits, it dumped a -20k +30k change set we'll need to review.
If we're very lucky, we'll break even time-wise compared to just running a single agent on a tight leash.
While reading your comment the Benny Hill theme Yackety Sax started playing in my head.
YOLO. Just ship it.
> I've never worked at a job in tech that didn't require code review
I have. Sometimes the resulting code was much worse than what you get from an LLM, and yet the project itself was still a success despite this.
I've also worked in places with code review, where the project's own code-quality architecture and process caused it to be so late to market that it was an automatic failure.
What matters to a business is ideally identical to the business metrics, which are usually not (but sometimes are) the code metrics.
The bottleneck at larger orgs is almost always decision-making.
Getting code written and reviewed is the trivial part of the job in most cases; discovering the product needs, considering/uncovering edge cases, defining business logic that is extensible or easily modifiable when conditions change, etc. are the parts that consume 80% of my time.
We in the engineering org at the company I work for have raised this flag many times during the adoption of AI-assisting tools. Now that the rollout is well underway, with most developers using the tools and changing their workflows, it has become the sore thumb sticking out: yes, we can deliver more code if it's needed, but what exactly do you need it for?
So far I haven't seen a speed-up in decision-making; the same chain of approvals, prioritisation, and definitions chugs along as before, and it is clearly the bottleneck.
I don't think that's actually the bottleneck?
The bottleneck is aligning people on what the right thing to do is, and fitting the change into everyone's mental models. It gets worse the more people are involved.
> Take a look at openclaw, the code base is an absolute disaster. 500k LoC for something that can be accomplished in 10k.
Mission accomplished: an acquihire worth probably millions and millions.
I agree with you, by the way.
It was a hire not an acquihire. There was no acquisition.
There was a big payoff on signing, so to-may-to, to-mah-to.
I'm sorry but consider how many more edge cases and alternatives can be handled in 500k LoC as compared to that tiny 10k.
In the days of AGI, higher LoC is better. It just means the code is more robust, more adaptable, better suited to real world conditions.
That’s… not how software works, no matter how it is produced. Complexity is the enemy; always.
I imagine that to get more robust code your agent can replace a for loop with long chains of if-then statements. Later the manager can brag about how many lines of code they created!
In high-performance teams it is. In bike-shedding environments of course it is not.
I'm not sure I'd call it bike shedding so much as that a lot of time and effort tends to go into hard to answer questions: what to build, why to build it, figuring out the target customer, etc. A lot of times going a thousand miles per hour with an LLM just means you figure out pretty quickly you're building the wrong thing. There's a lot of value to that (although we used to just call this "prototyping"), but, that doesn't remove the work of actually figuring out what your product is.
On the least productive teams I've been on, it usually wasn't engineering talent that was the problem; it was extremely vague or confused requirements.
I think you meant to say incompetent leadership.
This. The key bottleneck in many organizations is the "socialize and align" on what to build. Or just "socialize and align" in general. :)
One thing that always slowed me down was writing JSDoc comments and tests.
Now I can write one example of a passing test and then get Codex to read the code and write a test for every branch in that section. It saves time since it can type a lot faster than I can, and it's mostly copying the example I already have while changing the input to hit all the branches.
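Roughly the pattern, as a hypothetical sketch (Jest-style runner; the function and the numbers are made up, not from any real codebase):

    import { test, expect } from "@jest/globals";

    // A small function with a few branches.
    function shippingCost(weightKg: number, express: boolean): number {
      if (weightKg <= 0) throw new Error("invalid weight");
      if (express) return 10 + weightKg * 2;
      return 5 + weightKg;
    }

    // The one example I write by hand:
    test("standard shipping, 3kg", () => {
      expect(shippingCost(3, false)).toBe(8);
    });

    // The agent copies that shape and varies the inputs to hit the other branches:
    test("express shipping, 3kg", () => {
      expect(shippingCost(3, true)).toBe(16);
    });

    test("rejects non-positive weight", () => {
      expect(() => shippingCost(0, false)).toThrow("invalid weight");
    });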
> let's have LLMs check our code for correctness
Lmao. Rofl even.
(Testing is the one thing you would never outsource to AI.)
Outsourcing testing to AI makes perfect sense if you assume that tests exist out of an obligation to meet some code coverage requirements, rather than to ensure correctness. Often I'll write a module and a few tests that cover its functionality, only for CI to complain that line coverage has decreased and reject my merge! AI to the rescue! A perfect job for a bullshit generator.
Outsourcing testing to the AI also gets its code connected to deterministic results, and should let the agent interact with the code, speculate about expectations, and check them against the actual code.
It could still speculate wrong things, but it won't speculate that the code is supposed to crash on the first line.
> Testing is the one thing you would never outsource to AI
That's not really true.
Making the AI write the code, the test, and the review of itself within the same session is YOLO.
There's a ton of scaffolding in testing that can be easily automated.
When I ask the AI to test, I typically provide a lot of equivalence classes.
And the AI still surprises me by finding more.
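Concretely, it looks something like this (a hypothetical sketch; parseAge and the classes are invented for illustration, and the runner is Jest-style):

    import { test, expect } from "@jest/globals";

    // Hypothetical function under test: parses an age string, accepting 0-130.
    function parseAge(input: string): number {
      const n = Number(input);
      if (!Number.isInteger(n) || n < 0 || n > 130) throw new Error("invalid age");
      return n;
    }

    // Equivalence classes I hand to the model:
    //   valid: typical value, lower boundary 0, upper boundary 130
    //   invalid: negative, above the range, non-numeric
    test("accepts values in range", () => {
      expect(parseAge("42")).toBe(42);
      expect(parseAge("0")).toBe(0);
      expect(parseAge("130")).toBe(130);
    });

    test("rejects values outside the classes", () => {
      expect(() => parseAge("-1")).toThrow("invalid age");
      expect(() => parseAge("131")).toThrow("invalid age");
      expect(() => parseAge("abc")).toThrow("invalid age");
    });

    // The model then tends to add classes I forgot: "", "  42 ", "42.5", "1e2", ...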
On the other hand, it's equally excellent at saying "it tested", and when you look at the tests, they can be extremely shallow. Or they can be fairly many unit tests of certain parts of the code, but when you run the whole program, it just breaks.
The most valuable testing when programming with AI (generated by AI, or otherwise) is near-realistic integration tests. That's true for human programmers too, but we take for granted that casual use of the program we make as we develop it constitutes a poor man's test. When people who generally don't write tests start using AI, there's just nothing but fingers crossed.
I'd rather say: If there's one thing you would never outsource to AI, it's final QA.
> (Testing is the one thing you would never outsource to AI.)
I would rephrase that as "all LLMs, no matter how many you use, are only as good as one single pair of eyes".
If you're a one-person team and have no capital to spend on a proper test team, set the AI at it. If you're a megacorp with 10k full time QA testers, the AI probably isn't going to catch anything novel that the rest of them didn't, but it's cheap enough you can have it work through everything to make sure you have, actually, worked through everything.
You don't use the LLM to check your code for correctness; you use the LLM to generate tests to exercise code paths, and verify that they do exercise those code paths.
And that test will check that the code paths are run.
That doesn't tell you that the code is correct. It tells you that the branching code can reach all the branches. That isn't very useful.
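To make that concrete, a contrived sketch (the function and bug are invented): this test gets 100% branch coverage and says nothing about whether either branch is right.

    import { test, expect } from "@jest/globals";

    // Deliberately buggy: the member "discount" is applied as a surcharge.
    function price(amount: number, isMember: boolean): number {
      if (isMember) return amount * 1.1; // should be amount * 0.9
      return amount;
    }

    // Both branches are exercised, so coverage is happy, but the bug sails through.
    test("covers both branches", () => {
      expect(typeof price(100, true)).toBe("number");
      expect(typeof price(100, false)).toBe("number");
    });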