> If a language model was just a stochastic parrot, when we looked inside to see what was going on, we’d basically find a lookup table
I disagree right away. There are more sophisticated probability models than lookup tables.
> It'd be running a search for the most similar pattern in its training data and copying this.
Also untrue. A sophisticated probability model combines evidence from every part of the context, and it fuzzes similar tokens together by compressing them (i.e. it doesn't care which particular token was used, only that a similar one was).
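
To make the difference concrete, here is a toy sketch of my own; it is not how any real language model is implemented. The corpus, the `SIMILAR` classes, and the function names `lookup_table` and `blended` are all made up for illustration. `lookup_table` only answers when it has seen the exact context, while `blended` maps similar tokens to a shared class and mixes evidence from a longer and a shorter context.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: a tiny corpus and hand-written similarity classes.
CORPUS = "the cat sat on the mat . the dog ran to the rug .".split()
SIMILAR = {"cat": "ANIMAL", "dog": "ANIMAL", "mat": "FLOOR", "rug": "FLOOR"}

def exact(tok):
    return tok                      # no compression: every token is itself

def fuzzy(tok):
    return SIMILAR.get(tok, tok)    # compress similar tokens into one class

def train(corpus, n, key):
    """Count next-token frequencies for each length-n context, after `key`."""
    counts = defaultdict(Counter)
    for i in range(n, len(corpus)):
        ctx = tuple(key(t) for t in corpus[i - n:i])
        counts[ctx][corpus[i]] += 1
    return counts

def dist(counts, context, n, key):
    """Next-token distribution for the last n context tokens ({} if unseen)."""
    ctx = tuple(key(t) for t in context[-n:])
    c = counts.get(ctx, Counter())
    total = sum(c.values())
    return {tok: k / total for tok, k in c.items()} if total else {}

def lookup_table(context):
    # The "stochastic parrot as lookup table" picture: exact match or nothing.
    return dist(train(CORPUS, 2, exact), context, 2, exact)

def blended(context, weight=0.7):
    # Mix a 2-token and a 1-token context, both over compressed tokens.
    long_ctx = dist(train(CORPUS, 2, fuzzy), context, 2, fuzzy)
    short_ctx = dist(train(CORPUS, 1, fuzzy), context, 1, fuzzy)
    mix = {t: weight * long_ctx.get(t, 0.0) + (1 - weight) * short_ctx.get(t, 0.0)
           for t in set(long_ctx) | set(short_ctx)}
    z = sum(mix.values())
    return {t: p / z for t, p in mix.items()} if z else {}

# "dog sat" never appears in the corpus, but "cat sat" does.
print(lookup_table(["dog", "sat"]))  # {}
print(blended(["dog", "sat"]))       # {'on': 1.0}
```

The lookup table has nothing to say about "dog sat" because it never saw that exact pair. The blended model still predicts "on", because "dog" and "cat" share a class and the shorter context "sat" points the same way. It is still just counting, only with fuzzier counts.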
They're parrots, just better parrots than this person can conceive of.