The premise is flawed:

> Now that we have software that can write working code ...

While there are other points made which are worth consideration on their own, it is difficult to take this post seriously given the above.
If you haven't seen coding agents produce working code you've not been paying attention for the past 3-12 months.
I get the impression there’s a strongly bimodal experience of these tools, and I don’t consider that an endorsement of their long-term viability as they are right now. I am genuinely curious why this is. If the tooling were so obviously useful and a key part of the future of software engineering, I would expect it to have far more support and adoption. Instead, it seems to work very well for selected use cases and flounder in other situations.
This is not an attack on the tech as junk or useless, but rather an observation that it is a useful technology within its limits being promoted like snake oil, which can only end in disaster.
My best guess is that the hype around the tooling has given the false impression that it's easy to use - which leads to disappointment when people try it and don't get exactly what they wanted after their first prompt.
I think you and a lot of people have spent a lot of energy getting as much out of these models as you can and I think that’s great, but I agree that it’s not what they’re being sold as and there is plenty of space for people to treat these tools more conservatively. The idea that is being paraded around is that you can prompt the AI and the black box will yield a fully compliant, secure and robust product.
Rationality has long since gone out of the window with this, and I think that’s sort of the problem. People who don’t understand these tools see them as a way to just get rid of noisome people. What gets glossed over is that you need to spend a fair amount of money, fiddle with the tools by cajoling them with AGENTS.md, SKILL.md, FOO.md, etc., and then have enough domain experience to actually know when they’re wrong.
I can see the justification for a small shop spending the time and energy to give it a try, provided the long-term economics of these models make them cost-effective and the model can be coerced into working well for their specific situation. But we simply do not know, and I strongly suspect there’s been too much money dumped into Anthropic and friends for that to be an acceptable answer right now, as illustrated by the fact that we are seeing OKRs where people are forced to answer loaded questions about how AI tooling has improved their work.
> If you haven't seen coding agents produce working code you've not been paying attention for the past 3-12 months.
If you believe coding agents produce working code, why was the decision below made?
0 - https://www.businessinsider.com/amazon-tightens-code-control...

Good journalism would include: https://www.aboutamazon.com/news/company-news/amazon-outage-...
I find it somewhat overblown.
Also, I think there's a difference between working code and exceptionally bug-free code. Humans produce bugs all the time. I know I do, at least.
> Good journalism would include ...
The link you provided begins with the declaration:
I am not a journalist and even I would question the "good journalism would include" assertion, given the source provided.

> I find it somewhat overblown.
As I quoted in a peer comment:
If the above is "overblown", then it is the SVP who has overblown it. I have no evidence to believe this is the case, however. Do you?
> I am not a journalist and even I would question the "good journalism would include" assertion given the source provided.
You've misunderstood. I was saying good journalism would include both sides, and hopefully primary sources alongside the reporting, so readers can evaluate both.
> If the above is "overblown", then the SVP has done so. I have no evidence to believe this is the case however.
It says "at least one of those disruptions were tied to Amazon's AI coding assistant Q, while others exposed deeper issues." You initially cited this article as evidence that coding agents don't produce working code. But the SVP is describing a broader trend of deployment and control plane failures, most of which are classic infrastructure problems that predate AI tooling entirely. You're attributing a systemic operational failure to AI code generation when even your own source doesn't support that.
More fundamentally, your original argument was that the premise "software can write working code" is flawed. One company having incidents, some of which involved AI tooling, doesn't prove that. Humans cause production incidents every single day. By your logic, the existence of any bug would prove humans can't write working code either.
You appear to be confusing "produce working code" with "exclusively produce working code".
> You appear to be confusing "produce working code" with "exclusively produce working code".
The confusion is not mine own. From the article cited:
It appears to me that "Amazon's SVP of e-commerce services" desires producing working code and has identified the ramifications of not producing same.

That's why I'm writing a guide about how to use this stuff to produce good code.
> That's why I'm writing a guide about how to use this stuff to produce good code.
Consider the halting problem[0]:
Essentially, it identifies that no general procedure can prove whether an arbitrary program will or will not terminate based on the input given to it. So if math cannot express a solution to this conundrum, how can any mathematical algorithm generate solutions to arbitrary problems which can be trusted to complete (a.k.a. "halt")?

Put another way, we have all known "1 + 2 = 3" since elementary school. It is basic math everyone is assumed to know.
Imagine an environment where "1 + 2" 99% of the time results in "3", but may throw a `DivisionByZeroException`, return NaN[1], or rewrite the equation to be "PI x r x r".
Why would anyone trust that environment to reliably do what they instructed it to do?
0 - https://en.wikipedia.org/wiki/Halting_problem
1 - https://en.wikipedia.org/wiki/NaN
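The unreliable arithmetic described above is easy to model directly. The sketch below is a toy illustration only; the function name, the failure modes, and the 99% figure are hypothetical, chosen to mirror the comment's framing:

```python
import random

def unreliable_add(a, b, reliability=0.99):
    """Toy model of an environment where '+' only behaves as
    expected `reliability` fraction of the time. The specific
    failure modes are hypothetical illustrations."""
    roll = random.random()
    if roll < reliability:
        return a + b                        # the expected result
    elif roll < reliability + 0.005:
        raise ZeroDivisionError("spurious failure")
    else:
        return float("nan")                 # silently wrong answer

# Most calls return 3, but "most" is not "all" -- which is
# exactly why the result cannot be trusted without checking.
successes = 0
for _ in range(10_000):
    try:
        if unreliable_add(1, 2) == 3:
            successes += 1
    except ZeroDivisionError:
        pass
print(f"{successes} of 10,000 calls returned 3")
```

With `reliability=1.0` the function degenerates into ordinary addition, which is the guarantee deterministic tooling provides and this imagined environment does not.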
I find the challenge of using LLMs to usefully write software despite their non-deterministic nature to be interesting and deserving of study.
I get the appeal and respect the study you are engaging.
A meta-question I posit is: at what point does the investment in getting "LLMs to usefully write software despite their non-deterministic nature" become greater than the cost of solving the problems at hand without those tools?
For the purpose of the aforementioned, please assume commercial use as opposed to academic research.