As someone who has already had to "rescue" several of those oft-praised "AI-first" projects you hear so much about, I can safely say this: "It's not going super well."

The LLM-produced code I've seen so far (from moderate feature PRs up to and including entire services) tends toward 'too large' or 'too complex' for its small team of creators to properly vet. PRs against existing codebases are often riddled with minor, seemingly unrelated changes, or worse: large and/or subtle test suite alterations that "all pass" but contain hidden assumptions in conflict with reality and your business requirements. Numerous edge cases go entirely missed while trivial things get validated instead. Feedback loops with the system certainly improve the output, but they're frustrating and in many cases no faster than just writing the code yourself.

At this stage you still need a human in the loop, unless you're in the earliest stages of building a product. This being HN, I'm sure someone around here is employing LLMs successfully in those cases, but the story for established orgs tends to be more complex, especially in traditionally risk-averse fields like healthcare, billing, credit card processing, defense, etc. All fields I've worked in.

The sheer amount of code these LLM systems produce in aggregate means we have a deficit of cognitive spoons across our orgs to properly review and test all the changes being pumped out. In other words: more bugs end up being found in the field rather than earlier, because resource constraints push us to let our generators validate themselves.

Can we produce code faster than ever before? Sure, but that was never the real bottleneck to begin with, at least not in the orgs I've operated in over the past two decades.

To me, every new line of code needs an inherent justification for its existence. Code is often as much a liability as it is an asset. Think along several axes: the risk of security vulnerabilities, or the cognitive load once no one left in your org understands the vibe-coded system as it grows more unwieldy by the day. The business value of new code should outweigh its (not so) hidden costs of maintenance and risk of business disruption. With today's emphasis on using LLM output to "move faster", we seem to be ignoring that critical risk/reward analysis and operating under the incorrect assumption that more code is universally a good thing.

So for the moment I'm just using LLMs for "rubber ducking" or spitballing/testing out ideas before committing to a course of action, one that will ultimately be executed by a human, in the short term at least.

That said, on a good day my AI can serve as a proxy for a semi-competent software engineer (with amnesia), and on the other days it's no worse than an actual rubber duck.

(note that the above statements are my own personal observations and are not intended to represent or express any statement or opinion held by my employer, etc, etc.)