Hacker News

vidarh 20 hours ago [ - ]

If you ask humans to explain why we did something, Sperry's split brain experiment gives reason to think you can't trust our accounts of why we did something either (his experiments showed the brain making up justifications for decisions it never made)

Bit it can still be useful, as long as you interpret it as "which stimuli most likely triggered the behaviour?" You can't trust it uncritically, but models do sometimes pinpoint useful things about how they were prompted.

amluto 19 hours ago [ - ]

Humans can do one thing that AI agents are 100% completely incapable of doing: being accountable for their actions.

jumpconc 18 hours ago [ - ]

You haven't met certain humans. Not all humans have internal capacity for accountability.

The real meaning of accountability is that you can fire one if you don't like how they work. Good news! You can fire an AI too.

pessimizer 17 hours ago [ - ]

Bad news! They will not be aware that you have done this and will not care.

Zak 17 hours ago [ - ]

The purpose of firing a person shouldn't be vengeance but to remove someone who is unreliable or not cost effective.

It's similarly reasonable to drop a tool that's unreliable, though I don't think that's a reasonable description here. Instead, they used a tool which is generally known to be unpredictable and failed to sandbox it adequately.

bigstrat2003 16 hours ago [ - ]

The purpose of firing a person is to remove someone unreliable, but also, the person having that skin in the game makes him behave more reliably. The latter is something you cannot do with an LLM.

The cold hard fact is: LLMs are an unreliable tool, and using them without checking their every action is extremely foolish.

lukan 15 hours ago [ - ]

"The cold hard fact is: LLMs are an unreliable tool, and using them without checking their every action is extremely foolish."

You mean checking every action of theirs outside the sandbox I suppose? Otherwise any attempt at letting an agent do some work I would consider foolish.

jumpconc 15 hours ago [ - ]

The AI company has skin in the game which motivates them to produce reliable AIs.

dabinat 9 hours ago [ - ]

Can you actually sue Anthropic over this when they clearly state that AI can make mistakes and you should double-check everything it does?

jumpconc 2 hours ago [ - ]

You can fire Anthropic. Anthropic can decide it's losing too many customers and do something about it.

justinclift 14 hours ago [ - ]

Doesn't seem to be working though. :(

hun3 18 hours ago [ - ]

But it's still a bit more difficult to sue them for leaking your company's data.

At least for now.

grey-area 19 hours ago [ - ]

Don’t forget learning, humans can learn, LLMs do not learn, they are trained before use.

HighGoldstein 4 hours ago [ - ]

Do we? Or are we born with pre-training (all the crucial functions the brain does without us having to learn them) and a context window orders of magnitude larger than an LLM?

compass_copium an hour ago [ - ]

It is incredible how willing and eager AI boosters are to denigrate the incredible miracle of human consciousness to make their chatbots seem so special.

No, we are not born with all the pre-training we need. That is rather the point of education, teaching people's brains how to process information in new, maybe unintuitive ways.

addedGone 17 hours ago [ - ]

They learn on the next update :p

grey-area 9 hours ago [ - ]

That’s training, not learning.

quantummagic 16 hours ago [ - ]

Yup. And eventually there will be online learning, that doesn't require a formal update step. People keep conflating the current implementation, as an inherent feature.

lmm 9 hours ago [ - ]

What does that actually mean in practice? You can yell at human if it makes you feel better, sure, but you can do that with an AI agent too, and it's approximately as productive.

unyttigfjelltol 18 hours ago [ - ]

I disagree. They could fire Claude and their legal counsel could pursue claims (if there were any, idk)-- the accountability model is similar. Anthropic probably promised no particular outcome, but then what employee does?

And in the reverse, if a person makes a series of impulsive, damaging decisions, they probably will not be able to accurately explain why they did it, because neither the brain nor physiology are tuned to permit it.

Seems pretty much the same to me.

yladiz 13 hours ago [ - ]

> They could fire Claude and their legal counsel could pursue claims (if there were any, idk)-- the accountability model is similar.

What do you mean by fire? And how is the accountability similar to an employee?

antonvs 18 hours ago [ - ]

That’s a feature that other humans impose on whoever’s being held accountable. There’s no reason in principle we couldn’t do the same with agents.

LPisGood 18 hours ago [ - ]

How would you fire an agent? This impacts the company that makes the LLM, but not the agent itself.

jeremyccrane 19 hours ago [ - ]

Yep.

jayd16 18 hours ago [ - ]

You might as well be asking a tape recorder why it said something. Why are we confusing the situation with non-nonsensical comparisons?

There is no internal monologue with which to have introspection (beyond what the AI companies choose to hide as a matter of UX or what have you). There is no "I was feeling upset when I said/did that" unless it's in the context.

There is no ghost in the machine that we cannot see before asking.

Even if a model is able to come up with a narrative, it's simply that. Looking at the log and telling you a story.

vidarh 17 hours ago [ - ]

Sperry's experiments makes it quite clear that the comparison is not nonsensical: humans can't reliably tell why we do things either. It is not imbuing AI with anything more to recognise that. Rather pointing out that when we seek to imply the gap is so huge we often overestimate our own abilities.

fluoridation 16 hours ago [ - ]

Humans at least have a mental state that only they are privy to to work from, and not just their words and actions. The LLM literally cannot possibly have a deeper insight into the root cause than the user, because it can only work from the information that the user has access to.

lmm 9 hours ago [ - ]

> Humans at least have a mental state that only they are privy to to work from

Maybe. How do you tell? What would you expect to be different if they didn't?

> The LLM literally cannot possibly have a deeper insight into the root cause than the user, because it can only work from the information that the user has access to.

Insight is not solely a function of available input information. Arguably being able to search and extract the relevant parts is a far more important part of having insights.

fluoridation 3 hours ago [ - ]

>Maybe. How do you tell? What would you expect to be different if they didn't?

I think you're asking how I would know if other people were P-zombies. That's an inappropriate question because I didn't talk about subjective experience, just about internal state. There's no question about whether other people have internal states. I can show someone a piece of information in such a way that only they see it and then ask them to prove that they know it such that I can be certain to an arbitrarily high degree that their report is correct.

Unvoiced thoughts are trickier to prove, but quite often they leave their mark in the person's voiced thoughts.

>Insight is not solely a function of available input information. Arguably being able to search and extract the relevant parts is a far more important part of having insights.

LLMs are notoriously bad at judging relevance. I've noticed quite often if you ask a somewhat vague question they try to cold-read you by throwing various guesses to see which one you latch onto. They're very bad at interpreting novel metaphors, for example.

jayd16 15 hours ago [ - ]

It is non-sensical because you're simply bringing in comparisons without anything linking the two. You might as well be talking about how oranges, and bicycles think as well as that is just as relevant as how humans think in this discussion.

In fact, talking about "thinking" at all is already the wrong direction to go down when trying to triage an incident like this. "Do not anthropomorphize the lawnmower" applies to AI as much as Larry Ellison.

vidarh 8 hours ago [ - ]

The thing linking the two is that neither are able to accurately introspect and explain the actual reason why they made a decision.

If thinking is the wrong direction to go down, then it is also the wrong direction to go down when talking about humans.

jayd16 2 minutes ago [ - ]

If your plane fails to fly and humans can't fly then we should be looking at the musculature of humans when working on the plane?

abcde666777 16 hours ago [ - ]

Slight pushback - I think there's still a lot more consistency and coherence in a human's recollection of their motives than an LLM.

Sometimes I think we're too eager to compare ourselves to them.

vidarh 8 hours ago [ - ]

We have pretty much evidence to support that human recollection includes the right data to be able to ascertain why we actually did something.

tempaccount5050 16 hours ago [ - ]

I think you might be misinterpreting that. I always understood it to mean that when the two hemispheres can't communicate, they'll make things up about their unknowable motivations to basically keep consciousness in a sane state (avoiding a kernel panic?). I don't think it's clear that this happens when both hemispheres are able to communicate properly. At least, I don't think you can imply that this special case is applicable all the time.

vidarh 8 hours ago [ - ]

We have no reason to believe it is a special case. The fact that these patients largely functioned normally when you did not create a situation preventing the hemispheres from synchronising suggests otherwise to me. There's no reason to think the ability to just make things up and treat it as if it is truthful recollection would just disappear because there are two halves that can lie instead of just one.

cmiles74 19 hours ago [ - ]

None of the developers that I’ve worked with have had the hemispheres of their brains severed. I suspect this is pretty rare in the field.

lmm 9 hours ago [ - ]

> None of the developers that I’ve worked with have had the hemispheres of their brains severed.

But are their explanations for how they behaved any more compelling than those of people who have? If so, why?

pixl97 19 hours ago [ - ]

This still doesnt stop post ad hoc explanations by humans.

18 hours ago [ - ]

[deleted]

tempaccount5050 16 hours ago [ - ]

I feel like your conflating a deep misconfiguration of a brain with lying. These things are completely different.

layer8 15 hours ago [ - ]

The thing is, the LLM mostly just states what it did, and doesn't really explain it (other than "I didn't understand what I was doing before doing it. I didn't read Railway's docs on volume behavior across environments."). Humans are able of more introspection, and usually have more awareness of what leads them to do (or fail to do) things.

LLMs are lacking layers of awareness that humans have. I wonder if achieving comparable awareness in LLMs would require significantly more compute, and/or would significantly slow them down.

vidarh 8 hours ago [ - ]

Sperry's experiments suggests we don't have that awareness, but think we do as our brains will make up an explanation on the spot.

pierrekin 19 hours ago [ - ]

I agree that the model can help troubleshoot and debug itself.

I argue that the model has no access to its thoughts at the time.

Split brain experiments notwithstanding I believe that I can remember what my faulty assumptions were when I did something.

If you ask a model “why did you do that” it is literally not the same “brain instance” anymore and it can only create reasons retroactively based on whatever context it recorded (chain of thought for example).

XenophileJKO 19 hours ago [ - ]

Anthropic's introspection experiments have seemed to show that your argument is falsifiable.

https://www.anthropic.com/research/introspection

sumeno 18 hours ago [ - ]

> In fact, most of the time models fail to demonstrate introspection—they’re either unaware of their internal states or unable to report on them coherently.

You got the wrong takeaway from your link.

XenophileJKO 18 hours ago [ - ]

The parent said: "I argue that the model has no access to its thoughts at the time."

This is falsified by that study, showing that on the frontier models generalized introspection does exist. It isn't consistent, but is is provable.

"no access" vs. "limited access"

sumeno 17 hours ago [ - ]

There is no way for a user to know whether the LLM has introspection in a given case or not, and given that the answer is almost always no it is much better for everyone to assume that they do not have introspection.

You cannot trust that the model has introspection so for all intents and purposes for the end user it doesn't.

dwheeler 17 hours ago [ - ]

I would say "limited and unreliable access". What it says is the cause might be the cause, but it's not on any way certain.

fragmede 19 hours ago [ - ]

Claude code and codex both hide the Chain of Thought (CoT) but it's just words inside a set of <thinking> tags </thinking> and the agent within the same session has access to that plaintext.

fc417fc802 19 hours ago [ - ]

Those are just words inside arbitrary tags, they aren't actually thoughts. Think of it as asking the model to role play a human narrating his internal thought process. The exercise improves performance and can aid in human understanding of the final output but it isn't real.

lmm 8 hours ago [ - ]

What would be different if it was "real"? What makes you think that when humans "narrate" "their" "internal thought process", it's any more "real"?

antonvs 18 hours ago [ - ]

Why do you believe that humans have access to an “internal thought process”? I.e. what do you think is different about an agent’s narration of a thought process vs. a human’s?

I suspect you’re making assumptions that don’t hold up to scrutiny.

fc417fc802 18 hours ago [ - ]

I made no such claim and I don't understand what direct relevance you believe the human thought process has to the issue at hand.

You appear to be defaulting to the assumption that LLMs and humans have comparable thought processes. I don't think it's on me to provide evidence to the contrary but rather on you to provide evidence for such a seemingly extraordinary position.

For an example of a difference, consider that inserting arbitrary placeholder tokens into the output stream improves the quality of the final result. I don't know about you but if I simply repeat "banana banana banana" to myself my output quality doesn't magically increase.

DiogenesKynikos 14 hours ago [ - ]

Given that LLMs can speak basically any language and answer almost any arbitrary question much like a human would, the claim that LLMs have comparable (not identical) thought processes to humans does not seem extraordinary at all.

18 hours ago [ - ]

[deleted]

yladiz 13 hours ago [ - ]

Are you legitimately arguing that humans don’t have an internal thought process in some way?

vidarh 8 hours ago [ - ]

They're arguing that we have no evidence that humans have access to our underlying thoughts any more than the models do.

yladiz an hour ago [ - ]

What does that mean though, to “have access to our underlying thoughts”? Humans can obviously mentally do things that are impossible for a language model to do, because it’s trivial to show that humans do not need language to do mental tasks, and this includes things related to thought, so I don’t really get what is being argued in the first place.

jmalicki 19 hours ago [ - ]

It does have access to its thoughts. This is literally what thinking models do. They write out thoughts to a scratch pad (which you can see!) and use that as part of the prompt.

fc417fc802 19 hours ago [ - ]

It's important to be aware that while those "thoughts" can be a useful aid for human understanding they don't seem to reliably reflect what's going on under the hood. There are various academic papers on the matter or you can closely inspect the traces of a more logically oriented question for yourself and spot impossible inconsistencies.

mmoll 19 hours ago [ - ]

It doesn’t mean that these “thoughts” influenced their final decision the way they would in humans. An LLM will tell you a lot of things it “considered” and its final output might still be completely independent of that.

jmalicki 17 hours ago [ - ]

Its output quite literally is not independent, as the "thinking tokens" are attended to by the attention mechanism.

grey-area 19 hours ago [ - ]

They do not in fact do that. The ‘thoughts’ are not a chain of logic.

19 hours ago [ - ]

[deleted]

sumeno 18 hours ago [ - ]

You have a fundamental misunderstanding of what the model is doing. It's not your fault though, you're buying into the advertising of how it works

eleumik 14 hours ago [ - ]

Those are a funny progress bar made by a micro model , is just ui

emp17344 20 hours ago [ - ]

That is absolutely not what the split brain experiment reveals. Why would you take results received from observing the behavior of a highly damaged brain, and use them to predict the behavior of a healthy brain? Stop spreading misinformation.

nuancebydefault 18 hours ago [ - ]

Such 'highly damaged' brain is still 90 percent or more structured the same as a normal human brain. See it as a brain that runs in debug mode.

It is known that the narrative part of the brain is separate from the decision taking brain. If someone asks you, in a very convincing, persuasive way, why you did something a year ago and you can't clearly remember you did, it can happen that you become positive that you did so anyway. And then the mind just hallucinates a reason. That's a trait of brains.

Jensson 13 hours ago [ - ]

> If someone asks you, in a very convincing, persuasive way, why you did something a year ago and you can't clearly remember you did, it can happen that you become positive that you did so anyway. And then the mind just hallucinates a reason. That's a trait of brains.

Yes brains can hallucinate reasons, doesn't mean they always do. If all reasons given were hallucinations then introspection would be impossible, but clearly introspection do help people.

vidarh 17 hours ago [ - ]

Because said "highly damaged brain" in most respects still functions pretty much like a healthy one.

There is no misinformation in what I wrote.