> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would:
> 1. NOT have any representation about the meaning of the prompt.
> 2. NOT have any representation about what they were going to say.
But did any AI researchers actually claim there was no representation of meaning? I thought the criticism of LLMs was generally that, while they do abstract from their corpus - i.e., you can regard them as having a representation of "meaning" - that representation is tightly and inextricably tied to the surface-level encoding, isn't grounded in models of the external world, and is hard for LLMs to transfer to other surface encodings.
I don't know who the "certain AI researchers" are supposed to be. But the "stochastic parrot" paper by Bender et al. [1] says:
> Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind.
That's a very different objection to the one antirez describes - I think he's erecting a straw man. But I'd be happy to be corrected by anyone more familiar with the research.
> Text generated by an LM is not grounded in communicative intent
This means exactly that no representation should exist in the activation states of what the model wants to say, and that only single-token probabilistic inference is at play.
Their model also requires the counterpart on the input side: that the model does not know, semantically, what the query really means.
"Stochastic parrot" has a scientific meaning, and simply from observing how the models function it was already quite evident that this claim was very wrong. But now we also have strong evidence (via probing) that the sentence you quoted is not correct: the model represents the idea it wants to express in general terms, and features about things it is going to say much later activate many tokens earlier, including conceptual features that only become relevant later in the sentence or concept being expressed.
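To make "probing" concrete, here is a minimal sketch of the kind of experiment I mean: train a simple linear classifier on the activations of the prompt's last token and check whether it can predict a property of text the model has not yet generated. The model, prompts, labels and layer choice below are made up for illustration; this is not the setup of any specific paper.

```python
# Minimal sketch of a linear probe over hidden states (illustrative only).
# Assumes GPT-2 via Hugging Face transformers and scikit-learn; the prompts,
# labels and probed layer are invented for this example.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

# Toy dataset: prompts whose natural continuation is about animals (1) or not (0).
prompts = [
    ("The zookeeper opened the cage and out came the", 1),
    ("My neighbour's pet loves chasing the", 1),
    ("The accountant opened the spreadsheet and checked the", 0),
    ("She plugged in the laptop and launched the", 0),
]

LAYER = 6  # probe an intermediate layer of the residual stream

features, labels = [], []
with torch.no_grad():
    for text, label in prompts:
        inputs = tokenizer(text, return_tensors="pt")
        out = model(**inputs)
        # Hidden state of the last prompt token, before anything is generated.
        h = out.hidden_states[LAYER][0, -1]
        features.append(h.numpy())
        labels.append(label)

# If a simple linear classifier can predict, from the prompt's activations alone,
# a property of what the model is about to say, that is evidence the information
# is represented before generation -- the sense of "probing" used above.
probe = LogisticRegression(max_iter=1000).fit(features, labels)
print("probe train accuracy:", probe.score(features, labels))
```

Real probing studies use held-out data, many more examples and careful controls; this only shows the shape of the method.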
You are making the error that is common in this context: extending the "stochastic parrot" into a model that is not scientifically pinned down and can be made large enough to accommodate any evidence arriving from new generations of models. The stochastic parrot does not understand the query and is not trying to reply to you in any way; it just exploits a probabilistic link between the context window and the next word. That link can be more complex than a Markov chain, but it must be of the same kind: lacking any understanding and any communicative intent (no representation of the concepts / sentences required to reply correctly). How is it possible to believe this today? And check for yourself what the top AI scientists currently believe about the correctness of the stochastic parrot hypothesis.
> > Text generated by an LM is not grounded in communicative intent
> This means exactly that no representation should exist in the activation states of what the model wants to say, and that only single-token probabilistic inference is at play.
That's not correct. It's clear from the surrounding paragraphs what Bender et al. mean by this phrase. They mean that LLMs lack the capacity to form intentions.
> You are making the error that is common in this context: extending the "stochastic parrot" into a model that is not scientifically pinned down and can be made large enough to accommodate any evidence arriving from new generations of models.
No, I'm not. I haven't, in fact, made any claims about the "stochastic parrot". Rather, I've asked whether your characterisation of AI researchers' views is accurate, and suggested some reasons why it may not be.