Hacker News

pron 14 hours ago

> If you’re more or less experienced, you can easily see the “good” and “bad” sides of it. So you kinda plan it out in a way that you can “evolve AI generated software”.

If you're truly "managing fleets of agents" there's no way you're able to sift through the good and the bad in the output. If your AI-generated code is evolvable (which is hard to tell right now) then you're not writing it with "fleets of agents". If you are writing it with fleets of agents, I would bet it's not evolvable; you just haven't reached the breaking point yet.

tokioyoyo 5 hours ago

We’re not managing fleets of agents. They’re not productive for our workflows yet. It’s usually a couple of CC CLIs running and going back and forth on specific tasks we closely control.

pron an hour ago

My point is that they're not productive for any workflow, because they don't produce sustainable software, yet that's exactly what Armstrong is calling for. They don't work, and people experienced with AI workflows already know that.

If you review the code and tell the agent to revert when it gets things wrong (not functionally but architecturally) you're fine. That's not what I was responding to.

snapcaster 41 minutes ago

You're just wrong on this though, and I don't know why you aren't realizing it's a skill issue on your part.

pron 29 minutes ago

Nah, it's a skill issue on the part of those who believe in "agent swarms" (in fact, that's how I recognise AI noobs; they think swarms work). Studies (like this [1]) and Anthropic's experiments have told us they don't. We do experiments with software correctness and formal methods experts who actually dive deep into "swarm outputs" and try to put evolutionary pressure on them. Swarms simply cannot (yet) produce viable software. They do, however, produce software that for a while passes tests. What I think is happening is that people who believe swarms work just look at test results. But obviously, every software engineer has known for decades that tests can only tell you if your software works today; they can't tell you that it will work tomorrow. And the people who say that unreviewed agent output will work tomorrow are those who didn't review it closely enough, so they have no idea, either.

[1]: https://arxiv.org/abs/2603.03823