Humans can accurately retell what their consciousness was doing, but they have no clue why their unconscious responded as it did.

An LLM is just that unconscious part, the one humans have to explain post hoc, and it lacks the conscious part that we humans can actually inspect in ourselves.

If the AI had some introspective part that actually tracked its reasoning, maybe it would be closer to conscious humans. It's too expensive to do that everywhere, of course; not even we humans track everything like that, just a tiny bit, but tracking that tiny bit is enough for a lot of error correction to happen.

"Humans can accurately retell what their consciousness was doing" is often not true, because of complex psychological mechanisms. The feeling of shame alone can make it very hard for someone to accurately describe how they arrived at the wrong conclusion.

Plus it's an open question whether this is even a thing. Does consciousness consist of constructing actions beforehand, or of constructing justifications afterward?

Frankly, my opinion is that DNA is incredibly good at choosing the most energy-efficient/cheap option, and the cheaper option is definitely justifications afterward.

I feel strengthened in this view by psychological experiments in which people are shown fake events involving themselves and then "explain their (nonexistent) reasoning at the time".

Arguments for the idea that human consciousness (or the soul) is emergent keep getting shouted down, though, even though the extreme opposite view is obviously wrong. Nobody has ever cut open a human skull (or anything else) and found a soul. So somehow it's constructed from very non-conscious components we don't understand; it's not "actually there" in a real sense.

Sufficiently constrained post-hoc justifications are indistinguishable from explanations. Consciousness tries to make things up, it learns that people notice this, it then begins trying to construct justifications that won't be predictably called out as false. Eventually it learns how its unconscious operates, and how to interrogate it, and its post-hoc justifications, at least in the common cases, become reliable.

>Consciousness tries to make things up, it learns that people notice this, it then begins trying to construct justifications that won't be predictably called out as false.

There's a logical "skip" between that and

>Eventually it learns how its unconscious operates, and how to interrogate it, and its post-hoc justifications, at least in the common cases, become reliable.

The brain constructs a narrative that won't be called out as false, one that provides social capital, makes you feel good about yourself, is consistent with all your other justifications, etc. It's only an assumption that this process would naturally converge on Truth, and considering it's massively-multiplayer chaos where brains coordinate their stories in complex ways, my assumption is that this would converge on *stability*, not truth.

Yep. Because truth is easy, it converges on truth unless there's a strong reward for lies. It's a neural network. It just reads off/probes the internal state because that's the cheapest way to model the unconscious. The justification won't necessarily be true, mind, in terms of the labels it applies, but it should mostly be true structurally: behaviorally predictive in the ordinary domain.

(Even if you are incentivized to lie and flatter yourself, it is still helpful to have access to the true signal internally, because that way you can know how to structure your lie to best avoid detection.)

>Eventually it learns how its unconscious operates

I mean, no, we don't, neither personally nor in terms of global scientific understanding.

What you're describing is a set of socially consistent and acceptable responses based on general human knowledge at the time. The common cases aren't exactly reliable; they're repeatable in the sense that they cover what we expect, and they tend to explode when the world is less predictable.

This is why the scientific method changed the world, because we started writing shit down, comparing notes, and striving for repeatability.

I think a better way of putting this is that humans think they can accurately retell what their consciousness was doing. Whether they actually can, or even whether consciousness exists at all as a thing outside the perception of consciousness, is a philosophical question currently beyond answering.

I wonder if Monte Carlo tree search could play a role in reasoning. I'm searching and it seems to come up in arXiv papers, so the idea is not dead. I'll look more into this after writing this comment.

> Humans can accurately retell what their consciousness was doing

Can they? How could we possibly know this is the case? People could simply post-hoc rationalize this to justify whatever decision they made.