Hacker News

Assume LLMs have conscious experiences. Take a session with an LLM. A prompt is fed to the LLM. It generates some text. Another input is fed in, comprising the previous prompt, the generated text and a new prompt. The model generates some more text. This continues for a while and the session concludes.

Some questions:

1. Let's say we perform the exact same experiment, running the same program on the same computer with the same inputs and the same random seed. The same outputs are produced. The session is byte for byte identical in all the inputs, outputs and internal states. Is the conscious experience of the LLM here the same? If so, in what sense is it the same? Is it a similarity of two separate experiences or is it the same actual experience?

2. Now let's say the program that runs this LLM is rewritten from scratch and run on a different machine. The software and hardware are different but the weights are the same and all the inference calculations produce identical numbers. Is the conscious experience the same? In which sense?

3. Now say the weights are changed but the tokens generated for this particular session don't change. Same conscious experience?

4. Lastly, consider the original experiment. Did the LLM have a conscious experience corresponding to that first prompt and its response? Was that distinct from its conscious experience of the second prompt? Was the first experience then re-experienced every time the first prompt was fed back in as part of the later prompting steps? If so, what about the text of its own that it previously generated and is now fed back into it. Does this generate a conscious experience of its own?

And a further question - a dichotomy:

A. If the answer to 1 above is that the conscious experience is the same in the true identity sense - i.e. only one conscious experience is had, not a separate one in each run, does that imply that the conscious experience exists independently of any particular realisation of this experiment? If running this experiment N times results in exactly 1 conscious experience, is that still true if N=0?

B. On the other hand, if the two experiences are distinct (however similar they may be), how does that fit with the answer to question 4? A single consciousness experiencing the whole conversation in question 4 would seem at odds with the conscious experiences in question 1 being distinct, so doesn't this imply there is no conscious experience of the whole "conversation", but rather a separate conscious experience of each round of feed-all-the-prompts-and-outputs-back-in?

My own response to all of the above is "mu" - unask the question. It is ill-posed, sound-of-one-hand-clapping stuff. I think the questions assume properties that conscious experience simply doesn't have (particularly, the ability to perfectly reproduce the circumstances in which they arise), and that the questions simply don't make any sense in relation to actual conscious experience.

However, that way of thinking follows from a particular world view that many here don't share. I'm curious what thoughts people who take seriously the idea of LLM (or algorithmic, in general) consciousness have on the above questions.

I appreciate actually attempting to engage with the concept skeptically rather than assuming an answer in either direction.

As someone who's fairly neutral on the answer, I'd speculate:

1. Consciousness seems to me to be the awareness of the processing, not the processing itself. So running a program does not produce consciousness as there's no awareness, no feedback loop. Now a program that has self monitoring, alerting coupled to error handling that triggers the program to behave differently going forward, I would say has a minimal consciousness, and it's not a single experience, it's multiple equivalent experiences. If there were any differences at all we'd say it's obviously not one experience, so I don't see how the elimination of the last difference in the processing collapses all those separate experiences into a single one.

More specifically to this discussion, with LLMs we can often see them using a word, and through autoregression the following words reveal that the original word choice is now realized to be incorrect, and it attempts to correct midstream. You can also see them responding to error messages from the environment or criticism from the users but as external feedback these have a weaker claim to evidence of consciousness imo.

2. I'd say the experience is equivalent, "same" has slightly different semantics but perhaps even that fits. This thought experiment is equivalent to asking, what if we took a snapshot of your brain as weights, and then ran the calculations on a GPU rather than your neurons, and you to all external indications, GPU you acted the same as brain you. Would that change of venue eliminate the consciousness of your thoughts?

3. Since I am defining consciousness as awareness of the processing, this depends where the awareness is located. The LLM would have very different internal states in between the input and output tokens, but if its self awareness was only based on its own output the experience would be equivalent. If it was deeper such that it had ongoing awareness of the embeddings and so on, the experience would change.

4. This is where it gets really interesting, because it starts to get more into a nonbinary idea of consciousness. I think it's clear from questions like this that if LLMs have consciousness, it's nothing like ours. If there is an experience of being an LLM, it's nothing like being a human, even if conversationally we can communicate with each other rather easily. There are further twists to your questions, like does the experience change if the KV is cached or not? Without trying to micro-analyze the issue I will continue to point back to my foundation of: where is the self awareness located, and how does feedback occur.

A. Negated, I don't agree with your answer to 1. B. Agreed, I would speculate that if there is an LLM conscious experience, it is fragmented unlike ours. But I would disagree that this necessarily disproves the concept of LLM consciousness.