I didn't make the claim that a model can learn consciousness.

Understanding is not consciousness.

Their training is all about understanding. There is nothing in their architecture or training that credibly optimizes for rich self-awareness.

Given non-persistent experience, non-continuous operation, no ability to build up generalizations and aggregate experience of their own self-awareness over time, they seem to be structurally designed to not have consciousness.

This is a case where acting is very credible. Understanding of others' consciousness, in a functional and third-party sense, isn't a substrate for personal experience.

In stark contrast, humans develop consciousness gradually over continuous time with persistent aggregation of experience. By the time we can recognize our own consciousness in the abstract, and reason about it, we have had it for some time.

I use "consciousness" because it's the point of the original argument, but in fact I think my whole comment still works well if you replace "consciousness" with "understanding".

My point is that the fact that AI can convincingly reproduce human sentence continuation does not imply that the AI has no choice but to end up using a mechanism that "understands", rather than having just learned data patterns that are very effective at faking human sentence continuation but are meaningless in terms of understanding the concepts.

And I think that if the only way for AI to convincingly reproduce human sentence continuation were to end up in a configuration that uses the "understand" mechanism, the first LLMs would not have been so good at sounding human and yet so bad at basic "understanding" tests.

> the fact that AI can convincingly reproduce human sentence continuation does not imply that the AI has no choice but to end up using a mechanism that "understands", rather than having just learned data patterns

Taken as an absolute without any additional context, you are right.

But we are not talking about abstractions but about specific successful models. The number of parameters these models have may seem large, but it is very small relative to the training data that they have to summarize. They cannot do it without discovering the patterns that make sense of it.

And we can verify that. Simply discuss completely disparate topics, with some kind of intersection. Converge several highly unlikely topics; there are so many that it would take billions of years to exhaust the unlikely combinations.

If the model is only interpolating it will produce gibberish.

But that isn't what happens.

The fact that models can be near expert, and sometimes expert, across vast areas of human knowledge is a clue. If they don't understand that, then the question is, why do we think people understand things. Does having an answer mean a human understands something, or is their intuition and stream-of-consciousness reasoning also not understanding? We should be even-handed about what we mean by understanding.

> They cannot do it without discovering the patterns that make sense of it.

I don't think it's true at all, and I think we have evidence that it is false.

We have "basic" LLMs, the ones from 2023. They were producing _very convincing_ human text, and yet they were too often failing basic tests that require understanding.

Now we have more advanced models, but the counter-example of "basic" LLMs demonstrates that your assertion is incorrect: these models _did_ produce very convincing human text and yet did not make sense of it.

But for the more advanced models, the problem is that they are built "on top" of basic LLMs. The first step is a training run that builds a mechanism that produces convincing text without understanding, and then the "residuals" are fine-tuned. That is very unlikely to add "understanding" to the model, because to do so the whole system would need to deconstruct the basic LLM and go back through less efficient configurations in order to rebuild almost from scratch. Because modern LLMs are based on basic LLMs, the first step puts the cursor at the bottom of the "basic LLM mechanism" valley, which is a local minimum. Any layer on top of it cannot "climb up" the slope of the valley, pass the ridge and fall into the next valley, even if that next valley has a lower minimum.
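To make the valley picture concrete, here is a minimal sketch of plain gradient descent on a made-up one-dimensional loss with two valleys. The function `loss`, the starting points, the learning rate and the step count are all illustrative assumptions, not anything taken from a real model. It only shows the mechanics of getting stuck in the nearer, shallower valley; whether the high-dimensional loss landscape of an LLM behaves like this is exactly what is being argued here.

```python
def loss(x):
    """Toy 1-D loss with two valleys; the left one is deeper."""
    return x**4 - 2 * x**2 + 0.5 * x


def grad(x):
    """Derivative of the toy loss."""
    return 4 * x**3 - 4 * x + 0.5


def descend(x, lr=0.01, steps=5000):
    """Plain gradient descent from starting point x (no momentum, no noise)."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x


stuck = descend(1.5)    # starts on the slope of the shallow right valley
better = descend(-1.5)  # starts on the slope of the deep left valley

print(f"right start: x={stuck:.3f}, loss={loss(stuck):.3f}")
print(f"left start:  x={better:.3f}, loss={loss(better):.3f}")
```

The run that starts on the right settles at the bottom of the right valley and never crosses the ridge, even though the left valley has a lower loss.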

> The number of parameters these models have may seem large, but it is very small relative to the training data that they have to summarize.

That is demonstrably an incorrect logical jump. For example, CNNs are able to distinguish between pictures of cats and pictures of dogs. The weights in these models are very few relative to the number of pixels they have been trained on. Yet they distinguish cats and dogs by finding specific shapes in the pictures, without understanding what a 3-D cat and a 3-D dog are.

They have done that without discovering the typical human patterns that make sense of "cat" and "dog". And yet the number of weights is very, very small with respect to the number of pixels used in training.

> And we can verify that. Simply discuss completely disparate topics, ...
>
> If the model is only interpolating it will produce gibberish.

What you are saying is that the model is not doing simplistic interpolation. But that is a straw man argument: people who say that LLMs don't understand don't say LLMs are equivalent to simple interpolation machines.

But the problem is that you can have very good predictions in novel situations without understanding.

For example, suppose you have seen 10 totally different situations that can each be described by a Gaussian curve, and I then show you points from a new situation that cover only the left side of a Gaussian curve. You will be able to guess that the right side of the curve, which is not an interpolation since it corresponds to values you never saw, will behave like the rest of a Gaussian curve. And yet, in these 11 situations, I did not even say which real physical phenomenon I'm talking about. You haven't understood anything about these phenomena; all you have done is guess that a typical pattern you observed elsewhere is likely to apply here too, without having to understand anything about the reality of the situation.

And of course, this prediction is "a guess": maybe, for once, in this 11th situation, the curve will start as a Gaussian curve but will suddenly be different. But it happens that the reality is that in this 11th situation, the correct description is a Gaussian curve (because, due to the maths, Gaussian curves are really common). So, when you make your prediction, it looks like you understand the situation, it looks like you understood the physical mechanism that applied here. But it is not the case.

So, no, correctly making such a prediction does not demonstrate understanding.
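The Gaussian example can be sketched in a few lines. This is a toy illustration under stated assumptions: the parameters in `TRUE_PARAMS` are invented, the data is noise-free, and the fit uses the standard trick of fitting a parabola to `log(y)` with NumPy's `polyfit`. The fitting code is told nothing about what the numbers represent, yet it predicts the unseen right side of the curve.

```python
import numpy as np


def gaussian(x, amp, mu, sigma):
    """Gaussian curve with height amp, centre mu and width sigma."""
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def fit_gaussian(x, y):
    """Recover (amp, mu, sigma) by fitting a parabola to log(y)."""
    c2, c1, c0 = np.polyfit(x, np.log(y), 2)
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    mu = -c1 / (2.0 * c2)
    amp = np.exp(c0 - c1**2 / (4.0 * c2))
    return amp, mu, sigma


# Hypothetical "11th situation": made-up parameters, unknown to the fitter.
TRUE_PARAMS = (3.0, 2.0, 1.5)

# Only the left side of the curve (x below the peak at 2.0) is observed.
x_left = np.linspace(-2.0, 1.5, 30)
y_left = gaussian(x_left, *TRUE_PARAMS)

# Predict the never-observed right side from the fitted shape alone.
x_right = np.linspace(2.5, 6.0, 30)
fitted = fit_gaussian(x_left, y_left)
y_pred = gaussian(x_right, *fitted)
y_true = gaussian(x_right, *TRUE_PARAMS)

max_rel_error = float(np.max(np.abs(y_pred - y_true) / y_true))
print(f"max relative error on the unseen side: {max_rel_error:.2e}")
```

The prediction is good only because the curve really is Gaussian; if the real phenomenon deviated on the right side, the same code would be confidently wrong, which is the "guess" described above.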

> The fact that models can be near expert, and sometimes expert, across vast areas of human knowledge is a clue.

That is not at all sufficient. A Chinese room experiment will do that despite the system not understanding Chinese. A pocket calculator will be able to be expert in math computation.

> If they don't understand that, then the question is, why do we think people understand things.

That's the wrong question. The correct question is: we know people understand things, and we see AI behaving similarly to people in some respects, but does this behaviour _require_ understanding, or can we reproduce this behaviour without needing to understand?

The fact that "basic" LLMs were able to produce very convincing text that looked like they understood X, and yet demonstrably lacked understanding of X, shows that we cannot jump to the conclusion that just because it looks the same, the only possibility is that the core mechanism is identical.

I think most debates about LLMs understanding boil down to different definitions of the word "understand." For example, with the definition of "understand" that I typically use in my daily life, I would argue that in the Chinese room, the system as a whole "understands" Chinese.

Fair enough, but then a pocket calculator also understands math, and a pocket translator also understands language. And a Wikipedia page that informs you about radioactivity understands nuclear physics. Some will maybe say that is the case, but if we talk about LLM capabilities as a novelty, then it implies that we are talking about something else, because otherwise it is not novel at all and it does not make sense to pretend it is.

Thank you for writing this out.

It turns out that the optimal way to highly compress complex information is to understand it.

Sometimes, a problem being hard means you only get bad solutions, or increasingly accurate ones.

The planet isn't big enough for the proverbial interpolative stochastic parrot, over the training set of global human communication.

Two problems with that.

Firstly, how do you know that the optimal way to highly compress complex information is to understand it? You think it is obvious because you are very familiar with "understanding" as a way to summarise complex information. But there can be billions of different ways, outside of human imagination, that are as good or even better.

But secondly, LLMs don't find the optimal way, they find a local minimum. Everyone who has worked with NNs knows that they are prone to come up with spurious patterns, incorrect correlations and bad workarounds to guess the correct answer. You regularly need to nudge the NN with specifically engineered features to keep it from falling into the first local minimum.

When it comes to LLMs, it is extremely hard to check whether the LLM has triggered on a misleading pattern that, by chance, links two "tokens" together, or on a real concept that indeed links two "tokens" together. Basic probability implies that there are probably tons of "fake patterns" engraved into the weights during LLM training, "fake patterns" that should not exist if there were any kind of "understanding" of the abstract mechanism that links these tokens.

I think Searle's Chinese Room argument refutes this. LLMs are simply manipulating symbols, they do not have semantic understanding. This is why hallucinations exist. And Searle's argument extends even further than LLMs.

You are basically arguing for a functional account of consciousness, but things like this have been debated for literally decades/centuries in philosophy.

Millennia, in fact. The big difference, of course, being that we now have experimental philosophy machines (aka computers). So we can actually put some of these theories to the test, and recognize how utterly inadequate most of the work done on the subject has been. We had a pretty good idea anyway, so it's not a big surprise. Theories of mind evolved dramatically in the late 20th century. And it's pretty clear that theories of mind will have to be re-done all over again with the advent of LLMs (particularly current-generation LLMs).

The problem with the hallucination argument is (1) that it is much less of a problem with good current-generation AIs, and (2) that living, breathing, conscious human beings also have a disturbing tendency to make shit up. So a tendency to make stuff up doesn't really serve as a disqualifier for consciousness.

Also worth mentioning that the guiding rule of what's philosophical or not is whether it's actually useful. Actually useful philosophy usually becomes something else. Usually some scientific discipline or another. And as it turns out, theories of mind are likely to become extremely useful in the near future. Expect huge advances!

I think one could argue the opposite.

1) Good current-generation AIs are specifically trained to reduce hallucinations. If we had a new AI system that happened not to hallucinate as a side effect of its training, that would be convincing. But here, it looks like we have built a pocket calculator that answers 7+13 = 14, and on top of it we added a layer that says "if the input is 7+13, then replace the output by 20". This pocket calculator still does not know how to calculate; we just added a layer to hide its mistakes.

2) Not only is "making shit up" not the same as "hallucination" ("making shit up" happens either when the individual knows the claim is unreliable, or when the individual was given wrong inputs), but the point is not to say "hallucination implies no consciousness"; it is to say "large quantities of hallucinations in situations where a conscious system would be unlikely to hallucinate imply no consciousness".

>LLMs are simply manipulating symbols, they do not have semantic understanding.

Falsify this. Show me a way you'd be able to prove they do/don't, that would work for humans.

Searle's Chinese Room argument is wrong.

This is not helpful. In what way is it wrong? Does the person in the room know Chinese?

It is a helpful pointer for people who might otherwise assume that a well-known argument by a famous philosopher is sound without checking too deeply. Straightforward refutations can be found on Wikipedia or by thinking about it.

That just isn't true, there are no straightforward refutations of the Chinese Room that are widely accepted. Philosophers disagree about it. It's highly controversial and pretending that it's decided one way or another is not a helpful pointer for anyone.

>That just isn't true, there are no straightforward refutations of the Chinese Room that are widely accepted.

Yes there is, the systems reply is the obvious and correct answer. Philosophers that disagree are simply wrong. In the end what matters is what's true or false, not how many philosophers accept something. You can check for yourself by reading the argument, following its reasoning, and seeing that it is false; and reading the systems reply, following its reasoning, and seeing that it's true (https://plato.stanford.edu/entries/chinese-room/#SystRepl). The case is similar to those mathematical or logical proofs for the existence of god, where obviously fallacious reasoning gets a pass because it confirms deeply held beliefs.

edit: by the way, as to your assertion that the argument is controversial and there is no consensus, I just found something funny on Wikipedia (https://en.wikipedia.org/wiki/Chinese_room#History):

>Most of the discussion consists of attempts to refute it. "The overwhelming majority", notes Behavioral and Brain Sciences editor Stevan Harnad, "still think that the Chinese Room Argument is dead wrong". The sheer volume of the literature that has grown up around it inspired Pat Hayes to comment that the field of cognitive science ought to be redefined as "the ongoing research program of showing Searle's Chinese Room Argument to be false".

What you are referring to is Searle's assertion that "because of the Chinese room concept, I conclude that every future human-made system will be a Chinese room and will never be 'intelligent'".

I think it is an important nuance.

You have to be careful when saying "Searle's Chinese room" is dead wrong: the Chinese room concept in itself is useful and not controversial, and it is possible that current LLMs are "Chinese rooms", and therefore not 'intelligent'.

We could use the "Chinese room" term to denote a system that superficially mimics human speech, but breaks down at some point and/or uses different mechanisms such that it doesn't result in consciousness. But I don't think that was the intent of the argument and it's not how the argument is generally understood in the literature, so it would just be confusing IMO.

(And you still seem to be implicitly accepting that the basic argument is valid, which would be wrong.)

> You can check for yourself by reading the argument, following its reasoning, and seeing that it is false; and reading the systems reply, following its reasoning, and seeing that it's true

You are being tedious. I obviously have done this and I disagree with you. Saying that X is logically true and Y is logically false is not a demonstration of those baseless assertions. This is not helpful, what you're saying isn't true, and what I'm saying is backed up by the Wikipedia article. The bit you quote is simply stating that most literature about the Chinese Room is an attempt to refute it, which is obvious, because the people who are convinced see no need to publish saying so. The fact that people keep publishing means that they have not yet succeeded in refuting it.

Or I can simply say this: you've made a mistake in your logic. Actually, the Chinese Room argument is correct. Since you won't explicate your logic, neither will I.

Have a good day.

Ok, I think it's clear where we both stand, a good day to you too :)

I’m also fixated on the term “experience” in the context of this debate. To me, consciousness is something that one “experiences”, and the two concepts are intertwined.

I am far from convinced that the training and inference regimes of LLMs would qualify as “experience” by any sense of the word.

Now, if we hooked up a plethora of audiovisual and tactile sensors with live feedback directly to a neural network rich with transformers, that was always powered on and fully autonomous, we may be getting there. But we’d probably also be on the verge of manmade horrors beyond our comprehension.

Biological rodent neural networks in a Petri dish stimulated by electrical impulses - more or less conscious than LLMs?

Human on life support, unable to respond to any external stimuli, “braindead” - more or less conscious than LLMs?

A point of sorts. Assuming that is true (I don't think it is), the big question that urgently needs to be addressed is what happens when we DO give LLMs tools to interact with the real (or virtual) world. And people are doing that, right now, in both real and virtual worlds. And people ARE giving LLMs the ability to run continuously for long periods of time, sometimes with enormous context buffers. People ARE putting LLMs into robots with front-end ML and LLM systems for visual processing, and back-end ML systems for autonomous control.

And, yes, concerns about whether biological rodent neural networks are or are not conscious come up frequently in the biological neural network papers. I'm not sure I would want to be a researcher trying to get an experiment past an ethics committee if my biological neural network had 25B rat neurons. (I would hope that they could not).