Hacker News

sothatsit 2 days ago [ - ]

This “short leash” seems like more of a crutch to me, and a sign of not giving the AI enough detail on the problem to begin with, or not reviewing and iterating on its output.

Hand-holding great models like Fable through implementation is a waste of time, and a waste of Fable. You can have increasingly nuanced discussions with stronger models, and they write a lot better code than they used to. The process of discussing designs and their implementations, questioning things that look weird to you, and actually reading the AI’s responses also helps to find better solutions.

For example, one time I wanted to write a greedy solver for a problem, and in my discussion with Opus on the idea it suggested using an existing MILP library to solve the problem exactly. I’d never even heard of MILP, but my final implementation ended up being better and simpler than what I’d have done alone.

sixtram 2 days ago [ - ]

You say you can have increasingly nuanced discussions with stronger models.

What I say is, when I asked Claude why he applied a certain change I didn't understand, and boy, it was a small change, he said he "reasoned from first principles" based on the code paths. But it didn't work, and when I asked, "Okay, describe the steps of your reasoning from first principles," it literally answered that it had just made it up.

So, nuanced discussions with models, I don't buy it.

doctoboggan 2 days ago [ - ]

You can never ask why a model did a certain thing, or what it was "thinking" when it said something - just like you can't ask a human which neurons were firing when they had a certain thought. The information just isn't available at that level.

You absolutely can have deep nuanced discussions with LLMs however, you just need to better understand their strengths and weaknesses.

Shitty-kitty 2 days ago [ - ]

A human won't respond with "Neuron 10-100 of the frontal cortex" (jokes aside) with deceptively convincing confidence.

youdont 2 days ago [ - ]

The human will quite convincingly be able to construct a post-hoc reasoning on an action that may or may not be related at all to what was actually going through their head or the actual instinctual reasons that led to a decision.

Jensson 2 days ago [ - ]

Humans can accurately retell what their consciousness was doing, but they have no clue why their unconsciousness responded as it did.

LLM is just that unconsciousness part that humans have to post hoc explain like that, and lacks the conscious part that we humans actually can inspect in ourselves.

If the AI had some introspection part where it actually tracks its reasoning maybe it would be closer to conscious humans. Its too expensive to do that everywhere ofc, not even us humans tracks everything like that, just a tiny bit, but tracking that tiny bit is enough for so much error correction to happen.

stefanfisk a day ago [ - ]

"Humans can accurately retell what their consciousness was doing" is often not true, because of complex mechanisms. The feeling of shame alone can make it very hard for someone to accurately describe how the arrived at the wrong conclusion.

spwa4 a day ago [ - ]

Plus it's an open question if this is even a thing. Does consciousness consist of constructing actions beforehand, or of construction justifications afterward?

Frankly, my opinion is that DNA is incredible at choose the most energy efficient/cheap option, and the cheaper option is definitely justifications afterward.

I feel strengthened by psychological experiments where people are shown fake events involving them, where they then "explain their (nonexistent) reasoning at the time".

Arguments for the idea that the human consciousness/soul is something that is emergent keep getting shouted down though. Even though if you take the extreme opposite: it's obviously wrong. Nobody has ever cut open a human skull (or anything else) and found a soul. So somehow it's constructed from very non-conscious components we don't understand, it's not "actually there" in a real sense.

FeepingCreature a day ago [ - ]

Sufficiently constrained post-hoc justifications are indistinguishable from explanations. Consciousness tries to make things up, it learns that people notice this, it then begins trying to construct justifications that won't be predictably called out as false. Eventually it learns how its unconscious operates, and how to interrogate it, and its post-hoc justifications, at least in the common cases, become reliable.

Dilettante_ a day ago [ - ]

>Consciousness tries to make things up, it learns that people notice this, it then begins trying to construct justifications that won't be predictably called out as false.

There's a logical "skip" between that and

>Eventually it learns how its unconscious operates, and how to interrogate it, and its post-hoc justifications, at least in the common cases, become reliable.

The brain constructs a narrative that won't be called out as false, one that provides social capital, makes one feel good about oneself, is consistent with all your other justifications, etc. It's only an assumption that this process would naturally converge on Truth, and considering it's massively-multiplayer chaos where brains coordinate their stories in complex ways, my assumption is that this would converge on *stability*, not truth.

FeepingCreature a day ago [ - ]

Yep. It converges on truth unless there's a strong reward for lies because truth is easy. It's a neural network. It just reads off/probes the internal state because that's the cheapest way to model the unconscious. The justification won't necessarily be true, mind, in terms of the labels it puts, but it should mostly be true structurally- behaviorally predictive in the ordinary domain.

(Even if you are incentivized to lie and flatter yourself, it is still helpful to have access to the true signal internally, because that way you can know how to structure your lie to best avoid detection.)

pixl97 a day ago [ - ]

>Eventually it learns how its unconscious operates

I mean, no we don't, both in a personal way and in a global scientific understanding.

What you're saying happens is a set of socially consistent and acceptable responses based upon general human knowledge at the time. The common cases aren't exactly reliable, it's that they are repeatable in the sense they cover what we expect, and tend to explode when the world is less predictable.

This is why the scientific method changed the world, because we started writing shit down, comparing notes, and striving for repeatability.

lambdaone a day ago [ - ]

I think a better way of putting this is that humans think they can accurately re-tell what their consciousness was doing. Whether they actually can, or even if consciousness exists at all as a thing outside the perception of consciousness is a philosophical question currently beyond answering.

kzrdude a day ago [ - ]

I wonder if monte carlo tree search could play a role in reasoning. I'm searching and it seems to come up in arxiv papers, so the idea is not dead. I'll look more into this after writing this comment..

naasking a day ago [ - ]

> Humans can accurately retell what their consciousness was doing

Can they? How could we possibly know this is the case? People could simply post-hoc rationalize this to justify whatever decision they made.

Shitty-kitty 2 days ago [ - ]

That's exactly what the LLM seems to have done as well. The problem is that we want and even expect the A.I to be truthful.

sn0n 2 days ago [ - ]

Isn’t that part of what the think blocks are for? Yea, don’t inject them back into the context, but do log them for review of that train of thought… no?

NitpickLawyer 2 days ago [ - ]

You don't get access to the thinking traces. Might work with local models tho, but the current <thinking/> meta isn't particularly suited for this either, as it's a big blob of rambling surfaced by RL, with the "only" objective being that the thinking blob somehow leads to a better final answer. Something more detailed, using templates akin to oAI's harmony could work, provided there's also a step that teaches the models to reflect on the various thinking channels, and maybe surface bits and pieces to include in "skills" or "learnings".

atq2119 a day ago [ - ]

That's true, but it does mean that the LLM itself actually does have access to those thinking traces and could therefore, at least in principle, answer what it was thinking. They're probably not trained to do that, though.

NitpickLawyer a day ago [ - ]

It depends. Up until recently the models were trained only to "think" on the last user message. So you'd send the message1, got back reply1 w/ think1 but you'd make the next iteration m1 - r1 - m2, and would get back reply2 w/ think2. You would not add the thinking1. That's how the models were trained, and that's how you were supposed to construct the conversation.

Now recently some things have changed, and you can add the thinking part (you get that encrypted from the closed API labs). But the model needs to have been trained for this to work. And doing it this way you'll burn through tokens faster, as the thinking parts are usually rather long.

overgard a day ago [ - ]

You certainly can ask it what it was thinking, the problem is just that it's more likely to make up a plausible sounding fabrication than to say "I don't know" or "my reasoning is hidden for business reasons" (frontier models hide a lot of their chain of thought). Which is the fundamental problem with LLMs though, if the data doesn't exist or it's sparse they make things up.

muellero a day ago [ - ]

Choosing plausible sounding fabrication over an admission of ignorance is not an uncommon modality among the human beings I interact with, so I'm not surprised this pattern is found in models trained on human interactions.

OtomotO a day ago [ - ]

Totally fine. Then let's just not pretend these "AI"s are somehow better at it.

That's the whole problem with all of these discussions. It's whataboutism and "You're holding it wrong" allegations.

sixtram 2 days ago [ - ]

So you're saying I can absolutely have a deep, nuanced discussion with an LLM, as long as I don't ask how he arrived at his conclusions?

Jensson a day ago [ - ]

You can also have a deep nuanced discussion with a rubber duck as long as you don't ask any questions it needs to respond to.

user43928 2 days ago [ - ]

Have you not seen all the posts with claims that AI lies about its reasoning when asked to explain how it arrived at the output?

I would instead ask the model to explain how X works, whether it achieves Y, and why we cannot do Z instead.

That is how you have a discussion with the AI.

sothatsit 2 days ago [ - ]

You can have a nuanced discussion with an LLM. But LLMs also have failure modes where they start making up justifications. The two are not mutually exclusive.

pixl97 a day ago [ - ]

>as long as I don't ask how he arrived at his conclusions?

So just the average US political discussion with a human then?

a day ago [ - ]

[deleted]

afzalive a day ago [ - ]

> You can never ask why a model did a certain thing

Of course you can! It might be following outdated docs or read something in legacy code and tried to follow that pattern and it'll tell you as much if you ask it in a way that actually gets you the reason instead of it thinking it needs to immediately fix the mistake.

loose-cannon a day ago [ - ]

Dude, these two things are not at all analogous:

1. Asking a model why it did a certain thing, and

2. Expecting a human to say which neuron fired in their response.

lambdaone a day ago [ - ]

Even asking a human being why they did a certain thing is questionable. The research on choice blindness seems like a pretty definitive debunking of post-hoc rationalization:

https://en.wikipedia.org/wiki/Introspection_illusion#Choice_...

loose-cannon a day ago [ - ]

I'm not sure what point you're trying to make. In science and engineering, being able to provide justification is a core skill. The comparison we should be making is against the human practitioners who are trained in their fields. There will always be a distribution of ability. Saying that there's evidence that people are capable of providing post-hoc rationalization doesn't say anything about the ability of experts to produce well thought out responses (in their respective fields) that don't immediately fall apart under scrutiny.

lambdaone a day ago [ - ]

Structured thinking and deliberation are indeed important, but you can also make LLMs do structured "thinking" if you work hard enough, and generate quite plausible reasoned arguments with valid real-world results, and you can get them to write down their working as they go. But as research has shown, it's not "true" thinking, just pattern matching at a higher level, and eventually runs out of steam.[0]

But you only have to drill down a couple more layers and you are back in the void again; do you have any proof that your own thinking, no matter how structured and accurate, is anything other than pattern-matching at a sufficiently much higher level at which you are incapable of seeing it as such?

I think we will be finding some very interesting things out soon using the combination of LLMs and theorem provers, as demonstrated by Terence Tao's recent work.[1]

A cheetah is not a motorbike is not an aircraft is not a rocket.

[0] https://arxiv.org/abs/2506.06941

[1] https://arxiv.org/abs/2603.12744

semilin 2 days ago [ - ]

"Nuanced discussion" doesn't necessarily mean the sort one would have with a human. Statistical apologies are never going to be meaningful. One could edit nonsense into the context window and the model would attempt to rationalize it. The models are smart but you need to use them in a way that makes sense for what they are.

sothatsit 2 days ago [ - ]

"Nuanced discussions" is more about describing a design to a model, asking the model to critique your design and ask you for clarifications, and then you providing those clarifications and the model "getting it" and proceeding to additional levels of detail before implementation. In particular the models being able to highlight concerns you have not yet thought about is a pretty good sign of this. Fable is noticeably better at this compared to Opus.

I was not talking about models making mistakes. Mistakes, and then models making up justifications for those mistakes, is a failure mode of any LLM, and Fable is no different in that regard. Newer models might make less mistakes, or at least make less egregious mistakes, but they still make mistakes.

solenoid0937 2 days ago [ - ]

Posts like this are meaningless without more context - the model you're using, the harness, the initial prompt and context.

Fable is better than most staff engineers at my FAANG.

sn0n 2 days ago [ - ]

Maybe I’m missing something, but he talks about charm and tasks (repos on his GitHub). Charm being his harness, and tasks being one of his skills. Idk, maybe I’m mistaken from reading the article…

https://github.com/taoeffect

maccard a day ago [ - ]

> Fable is better than most staff engineers at my FAANG.

While this wouldn’t entirely surprise me, my experience is just not that. Using Claude and fable, it regularly (poorly) recreates features that exist inside our codebase. Sure, I could give way more initial context but at a certain point I’ve given so much context that I would have been faster writing the code myself, or I could have literally handed it to even a fresh graduate to write.

Toutouxc 17 hours ago [ - ]

> Fable is better than most staff engineers at my FAANG.

That’s genuinely disturbing.

hrmon 2 days ago [ - ]

But staff engineers take "responsibility"

OtomotO a day ago [ - ]

Including you?

well_ackshually a day ago [ - ]

Fable will definitely be the one on call when it inevitably breaks down from the pile of shit slop it wrote at 5AM, don't worry <3

solenoid0937 a day ago [ - ]

We already use AI for oncall and it works better than our humans most of the time.

dolebirchwood 2 days ago [ - ]

> he

cpursley a day ago [ - ]

[flagged]

atq2119 a day ago [ - ]

We can point out mistakes that feel rather grating without assuming intent behind them.

I agree that their use of "he" is likely because they're not a native speaker, especially because they're arguing against the capabilities of LLMs.

That doesn't make it inherently wrong to point out the mistake when it's so intertwined with the deeper discussion here, especially given the fact that some (hopefully few) people do build relationships with LLMs.

weakfish a day ago [ - ]

> turd bucket autist

I’d be more willing to engage with your argument in good faith without inflammatory language like this. Try and meet people where they are and these conversations become easier.

a day ago [ - ]

[deleted]

cpursley a day ago [ - ]

[flagged]

weakfish a day ago [ - ]

I’d prefer kindness and good faith when talking to strangers, but maybe my expectations are too high.

Do you think you’ll change someone’s mind by being an asshole? Rarely works.

cpursley 9 hours ago [ - ]

You’re in agreement with me, call these toxic language police jerks out as soon they have keyboard spasms.

15 hours ago [ - ]

[deleted]

recroad 2 days ago [ - ]

That may be true but it’s still capable of nuanced discussions.

densekernel 2 days ago [ - ]

I tend to agree,

If you have invested significantly in the planning phase and there is momentum in the architecture and conventions that already exist in the project, the implementation phase might not need as much oversight as is suggested here.

> You can discover that your initial idea was dumb and a better one exists

The planning and architecture phase is usually where I make these types of discovery at a high level.

> Your agent might go “off the rails” and start doing something you don’t want it to do

Candidly these orthogonal, inadvertent edits aren't as bad as they once were and for impactful changes there should be at least some test coverage, even if that test coverage is just "freezing" what was implemented.

As you mentioned the final review discussion is a good chance to verify beyond what review or adversarial review agents find.

visarga 2 days ago [ - ]

I think the obvious solution here is to beef up the test side of the app, much more than when writing code by hand. Tests represent project knowledge in executable format. The LLM does not need to be careful to remember every detail of the tests. You don't need to vet every small interaction, it automates review work as well.

Even better if the project was built from the start to be easier to test and observe. But my golden rule remains - no code without tests, expand test suite all the time.

densekernel a day ago [ - ]

I agree, human-steered, AI-implemented test cases can at least capture the acceptance criteria.

It's then more efficient to inspect if existing test cases are being modified as part of the delivery of something new and inspect why.

RealityVoid 2 days ago [ - ]

I am a bit confused which part you disagree with specifically. Reading AI responses and reviewing code seems to be what you propose as well.

Your example with MLIP is something that would not be prevented by this approach, during the planing phase, it would surface.

I guess the devil is in the details and the way you prompt it for starting the task matters.

But IMO you absolutely need to check the output, need to engage with what the model is doing, need to probe why something is built the way the model tries to build it.

sothatsit 2 days ago [ - ]

I disagree with keeping an eye on the model as it is working, approving every command, and denying and stopping the model when you think it has gone wrong. It is not that it is actively harmful to do this, but rather that it is a waste of time and you can avoid the need for it through better design discussions and review.

Micro-managing and keeping the AI on a "short leash" also lends itself better to telling models to do smaller units of work at a time instead of discussing broader design concerns. That is why I think someone doing this would miss the MILP solution, because they might never discuss the overall design with the model but rather just tell it what to implement next.

RealityVoid 2 days ago [ - ]

I personally am somewhere between you and the author. I don't check _all_ the intermediary steps, but I do try to understand what it's doing [1] and follow the process. Mostly I let it do the changes itself without supervision at each step but when a coherent "chunk" of work is done, I go through it really thoroughly. In almost 90% of the cases after a chunk is done some adjustments are needed.

I find broad architectural design to be _better_ if you follow along in the process because you better understand the direction it's going earlier and you can shift the high level direction much earlier. Even if you check its steps, you can ask it for its take on high-level architectural aspects along the way, no problem. I think personal touch matters a lot though, because I naturally ask it and try to get the big picture image.

[1] I actually find it really instructive what tooling it uses to tackle a problem, I got to become a much better console user because of it

te_chris 2 days ago [ - ]

I agree. Better to let it rip in a sandbox then spend your time correcting the finished product.

Waste of time being in the middle.

fy20 2 days ago [ - ]

The article feels like micromanaging AI. If you think about it like a junior employee, micromanaging them will mean they end up doing the work you want and do it your way. But they won't bring any of their ideas to the table, which in the long run could be beneficial to everyone on the team.

echelon 2 days ago [ - ]

This is the method I use.

It makes sure that I understand everything being generated and that I maintain a firm working knowledge of the codebase at all times.

I can easily steer it too.

2 days ago [ - ]

[deleted]

zquzra 2 days ago [ - ]

Do you have a background in CS or optimization? MILP is a pretty standard concept in algorithms/optimization. So this example doesn't really convince me that the AI reached some unusually superior conclusion. It sounds more like it suggested a well-known technique that you personally hadn't encountered. Useful, yes, but that seems more about background knowledge gaps than about the merits of letting the tool run unconstrained.

sothatsit 11 hours ago [ - ]

There are always concepts that some people think are a basic, that others haven't heard of. The entire benefit here is that AI can point out what we miss. There are certainly techniques you don't know about, or just didn't think to apply to a problem, that others would find to be pretty standard.