For a while, some people dismissed language models as “stochastic parrots”. They said models could just memorise statistical patterns, which they would regurgitate back to users.
…
The problem with this theory is that, alas, it isn’t true.
If a language model were just a stochastic parrot, when we looked inside to see what was going on, we’d basically find a lookup table. … But it doesn’t look like this.
But does that matter? My understanding is that, if you don’t inject randomness (“heat”, i.e. sampling temperature) into a model while it’s running, it will always produce the same output for the same input: in effect, a lookup table. The fancy stuff happening inside that the article describes amounts to [de]compression of that lookup table.
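To make the determinism point concrete, here is a minimal, hypothetical sketch in Python (a toy stand-in, not a real model or any particular library’s API; the names toy_model, next_token and VOCAB are invented for the example): at temperature zero the next-token choice is a pure function of the input, so in principle the whole mapping could be tabulated; only a non-zero temperature injects randomness.

    import math
    import random

    VOCAB = ["the", "cat", "sat", "mat", "dog"]

    def toy_model(prompt: str) -> list[float]:
        """Stand-in for a trained network: deterministically maps a prompt
        to next-token logits. The hashing is arbitrary, purely illustrative."""
        return [math.sin(hash((prompt, tok)) % 1000) for tok in VOCAB]

    def next_token(prompt: str, temperature: float = 0.0) -> str:
        logits = toy_model(prompt)
        if temperature == 0.0:
            # Greedy decoding: no randomness, output is a pure function of the prompt.
            return VOCAB[logits.index(max(logits))]
        # Temperature sampling: softmax over scaled logits, then a random draw.
        scaled = [x / temperature for x in logits]
        m = max(scaled)
        weights = [math.exp(s - m) for s in scaled]
        return random.choices(VOCAB, weights=weights, k=1)[0]

    # Deterministic case: repeated calls agree, so the mapping could,
    # in principle, be written down as a (huge) lookup table.
    assert next_token("the cat") == next_token("the cat")

    # With temperature > 0, repeated calls on the same prompt can differ.
    print({next_token("the cat", temperature=1.5) for _ in range(20)})

The point of the sketch is only that the randomness lives in the sampling step, not in the network itself.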
Of course, maybe that’s all human intelligence is too (the whole ‘free will is an illusion in a deterministic universe’ argument is about exactly this), but just because the internals are fancy and complicated doesn’t mean it isn’t a lookup table.
Everything can be represented as a lookup table, or at least everything we can rigorously reason about, because set theory can serve as a foundation of mathematics, and relations there are sets of pairs (essentially lookup tables).
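As a toy illustration of that set-theoretic point (the function and names here are made up for the example): any function with a finite domain can literally be written out as its set of input/output pairs, i.e. as a lookup table.

    # Tabulate a (finite) function as its set of (input, output) pairs.
    def f(n: int) -> int:
        return n * n

    domain = range(5)
    table = {n: f(n) for n in domain}   # the relation, stored as a lookup table
    assert all(table[n] == f(n) for n in domain)
    print(sorted(table.items()))        # [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16)]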
I guess that means we can throw away the notion that "it can be represented as a lookup table" has some profound meaning, at least without further qualification: is the lookup table finite or infinite, can it be constructed in time polynomial in the number of entries, and so on.