That’s the reverse-centaur issue I see: humans are not great at repetitive nuanced similar seeming tasks, putting the onus on humans to retroactively approve high volumes of critical code has them managing a critical failure mode at their weakest and worst. Automated reviews should be enhancing known good-faith code, manual reviews of high volume superficially sound but subversive code is begging for issues over time.

Which results the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place.

An LLM workflow that yields 10x an engineer but psychopathically lies and sabotages client facing processes/resources once a quarter is likely a NNPP (net negative producing programmer), once opportunity and volatility costs are factored in.

I have found ai extreemly good at finding all those really hard bugs though. Ai is a greater force multiplier when there is a complex bug than in gneen field code.

> Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place

The math depends on importance of the software. A mistake in a typical CRUD enterprise app with 100 users has zero impact on anything. You will fix it when you have time, the important thing is that the app was delivered in a week a year ago and was solving some problem ever since. It has already made enormous profit if you compare it with today’s (yesterday’s ?) manual development that would take half a year and cost millions.

A mistake in a nuclear reactor control code would be a total different thing. Whatever time savings you made on coding are irrelevant if it allowed for a critical bug to slip through.

Between the two extremes you thus have a whole spectrum of tasks that either benefit or lose from applying coding with LLMs. And there are also more axes than this low to high failure cost, which also affect the math. For example, even non-important but large app will likely soon degrade into unmanageable state if developed with too little human intervention and you will be forced to start from scratch loosing a lot of time.