LLMs are still next-token predictors. Just because you can give a model vaguer instructions and it still finds the right steps to follow, it doesn't mean it's intelligent. It means you're speaking the same language as the harness they trained your model on.
And that has a limit. If you are stuck at PoC level or simple apps, you have no idea how limited the current models still are. There you really need to break tasks down, not just trust a token predictor to list steps that sound good. There has to be a human in the loop somewhere, because once you start skipping permissions, best case you hit the jackpot; more likely you get a suboptimal solution and wasted tokens; and, what's still genuinely terrifying, the model sometimes ignores instructions and does some stupid nonsense, ruining your day. It really is as sharp as a CNC machine. It's not that it isn't useful, but it can be dangerous, so maybe don't try to carve wood with a monster machine, or park your Ferrari in that cramped neighbourhood, if you don't know how to parallel park.
"Next token prediction" is an interface, not an algorithm. A process that "predicts next tokens" can be arbitrarily complex or simple, and arbitrarily capable or incapable of performing a given task.
Saying that an LLM can or can't do something because it's a "token predictor" is a category error. The interface isn't a hard limit.
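To make the "interface, not an algorithm" point concrete, here is a minimal sketch. The names `NextTokenPredictor`, `RepeatLast`, `Adder` and `generate` are made up for illustration and do not come from any real library. Both classes satisfy the same one-method interface, yet one just echoes and the other actually computes something.

```python
from typing import Protocol, Sequence


class NextTokenPredictor(Protocol):
    """Hypothetical interface: given the tokens so far, return the next one."""

    def next_token(self, context: Sequence[str]) -> str: ...


class RepeatLast:
    """Trivial predictor: echoes the most recent token."""

    def next_token(self, context: Sequence[str]) -> str:
        return context[-1] if context else ""


class Adder:
    """Predictor that does real computation behind the same interface.

    Given a context ending in [a, "+", b, "="], it emits the sum a + b.
    """

    def next_token(self, context: Sequence[str]) -> str:
        if len(context) >= 4 and context[-1] == "=" and context[-3] == "+":
            return str(int(context[-4]) + int(context[-2]))
        return "?"


def generate(model: NextTokenPredictor, prompt: Sequence[str], steps: int) -> list[str]:
    """Generation loop: each output token is appended and fed back in."""
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(model.next_token(tokens))
    return tokens
```

The loop in `generate` is identical for both; whatever capability exists lives entirely behind `next_token`.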
I'm not sure if it has any real bearing on real-world performance, but technically next token prediction makes it an online algorithm, and those can be provably worse than (good) offline algorithms.
The word "prediction" still holds a lot of weight. LLMs can only predict what has been written. This is a hard limit.
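For anyone unfamiliar with the online/offline distinction, a rough sketch using the classic ski rental problem (an analogy only, nothing LLM-specific; the function names are made up here). Renting costs 1 per day, buying costs `buy_price`, and the online strategy has to commit each day without knowing how many days are left.

```python
def offline_cost(days: int, buy_price: int) -> int:
    """Optimal cost when the total number of days is known in advance."""
    return min(days, buy_price)


def online_cost(days: int, buy_price: int) -> int:
    """Break-even strategy: rent for buy_price - 1 days, then buy."""
    if days < buy_price:
        return days  # the season ended before we ever bought
    return (buy_price - 1) + buy_price
```

With `buy_price = 10` and a 10-day season, the offline optimum pays 10 while the break-even online strategy pays 19, so committing step by step can cost almost twice as much as seeing the whole input first.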
For something like "a hard limit" to hold, LLMs must be restricted to only reproducing existing text. This is utterly false even for base models - their basin seems to be "permutations loosely inspired by existing text".
And that's before all the post-training comes in.
What's the "limit" there?
> it doesn't mean it's intelligent
I'm not sure how you're defining "intelligent", but I'd like to know how it is able to exclude a language model, while still including humans, without simply defining it with an axiom that predefines LLMs as lacking intelligence.
Intelligence is the complete opposite of an LLM. Usually the more you needed to memorize to do something the less intelligent you were considered.
It was also not considered to be a different route to the same thing, but more like fraud.
Also conceptually I could just write the weights on paper and do the billion multiplications on paper without any computer, does that mean I am the paper or the numbers or what??
> Intelligence is the complete opposite of an LLM. Usually the more you needed to memorize to do something the less intelligent you were considered.
Contrary to popular belief, training an LLM is not just about memorization (overfitting). There is some memorization happening, but well-trained LLMs also generalize.
> Intelligence is the complete opposite of an LLM
Like I said, "without simply defining it with an axiom that predefines LLMs as lacking intelligence"
Intelligent humans are capable of following diverse and intricate analogies and drawing lessons from seemingly unrelated events. Try asking an LLM to summarize an article and use an imprecise way to state your view. Ask it to push back. You will be drawn into so many pedantic arguments, burning through your tokens within a few messages, that you'd wonder if there's someone deliberately taking over the keyboard on their side and spending your token limit. This would never happen with an intelligent human being unless they have nothing better to do and want to troll. This is a speech pattern that LLMs are trained on; it's not a show of intelligence. This also applies to LLMs claiming consciousness: the internet is full of people writing about sentience, talking to "superior aliens" in blog posts, forum threads etc. It's the speech pattern that's copied, not actual thoughts and feelings, because LLMs don't perceive, suffer, have aims or dreams...
Agentic systems use LLMs, and they are absolutely able to follow diverse and intricate analogies. I use them frequently to hunt down notoriously difficult to find memory leaks, in codebases too large for a human to read in a single sitting. They are able to not only follow those intricate paths, they're able to discover solutions and apply those solutions. I use these systems quite a bit, and it's nothing like you've described.
An LLM has a fixed number of ways it can express itself. We can give it an array of 14 billion options but it still has to choose one to output. Humans have no such limitation.
An LLM does not persist in consciousness from one token to the next. Each generation, happening hundreds of times a second, will be initialized, generate an output, and terminate. Humans are not stateless like an LLM.
You're conflating a singular model with a much larger system, but I want to address some of your points anyway.
> An LLM has a fixed number of ways it can express itself
While deterministic, there is not a fixed number of ways it can express itself, given that we can use settings like temperature to inject randomness into the output.
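A rough sketch of what temperature does at the sampling step. This is generic softmax sampling over a toy vocabulary, not any particular model's API; `softmax_with_temperature` and `sample_token` are names made up here.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(vocab, logits, temperature, rng):
    """Draw one token from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]
```

The scores themselves are computed deterministically; the randomness enters only in `rng.choices`. A low temperature piles nearly all of the probability onto the top-scoring token, and a high one flattens the distribution.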
> An LLM does not persist in consciousness from one token to the next
While a model alone does not update itself to persist some form of history, there are a number of ways to overcome this, e.g. episodic memory, fine-tuning, and other self-improvement systems exist, which can indeed carry forward what you've called "consciousness".
> Humans are not stateless like an LLM.
A single LLM might be stateless, but an agentic system that relies on LLMs is very often not.
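A toy sketch of that distinction. `stateless_model` is a stand-in for an LLM call and `Agent` is a made-up wrapper, not a real framework: the model is a pure function of its prompt, while the wrapper carries history across turns.

```python
def stateless_model(prompt: str) -> str:
    """Stand-in for an LLM call: the output depends only on the prompt."""
    return f"seen {len(prompt.splitlines())} line(s)"


class Agent:
    """Wrapper that keeps history and replays it to the model on each turn."""

    def __init__(self, model):
        self.model = model
        self.history: list[str] = []

    def step(self, message: str) -> str:
        self.history.append(f"user: {message}")
        reply = self.model("\n".join(self.history))
        self.history.append(f"agent: {reply}")
        return reply
```

Sending the same message to the agent twice gives different replies, because the second call sees the accumulated history, while calling `stateless_model` twice with the same prompt always returns the same string.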
> While deterministic, there is not a fixed number of ways it can express itself, given that we can use settings like temperature to inject randomness into the output.
You're missing the point, which is that no matter the process involved, the LLM can only ever output one of the tokens in its token vector. It can't invent a new symbol or character. It can't leave and go build a church. It has to output a little piece of data for you.
You're moving the goalposts. If the definition of intelligence is based on ability to "go build a church", then we've ruled out the vast majority of the animal kingdom from being labeled "intelligent". If you cannot be consistent in your definition of "intelligence", then you cannot have a reliable litmus test for it.
I wasn't trying to make a reliable litmus test for it.
Either way, if you consider animals, LLMs are even more poorly positioned. They can do exactly none of the things my cat can do. An LLM can string together words, but if my cat is intelligent, it's clear that stringing together words is not synonymous with intelligence, since my cat can't do that.
Animals do in fact "string words together", e.g. parrots. You're also misidentifying what "language" is. Language in this context is not just the ability to string words together. Consider a musician: when they learn to play an instrument, they are learning the language of that instrument. Notes are tokens, ensembles are sentences and paragraphs. I'm afraid you're experiencing confirmation bias, because every piece of evidence presented to you has been dismissed with things like "stringing together words is not synonymous with intelligence, since my cat can't do that".
Yeah, and you’re just a next-word-sayer.
Chinese whispers, simulacra... I don't have the energy to argue after being called names, but you get the point. Yes, LLMs are useful in building automatic telling machines, but ask one to do anything more substantial and all you are doing is burning tokens at the altar of Anthropic and hope. That just doesn't fly in regulated industries.
I love this argument. Not because it’s true but because it betrays the poster's doubt in their own sentience.
It's impossible for someone to doubt their own sentience. The literal act of doubting is enough to dissipate all doubt. Solipsism is essentially the one certainty that every mind out there has.
Doubting the sentience of machines and even other humans is perfectly fine though. Only empathy allows people to make the leap and assume other humans have souls.
So you posit that humans are solipsistic by default, but some (most?) develop more and realize they’re not the only conscious being out there?
"Realize" is too strong a word. You're the only one who can verify that you're the soul who's staring out at the world through your eyes. For all you know, everyone else could be just biological automatons, golems.
Any leap beyond that is based on empathy. You have a soul, and you are human, therefore other humans could have souls too. It's a spiritual belief. Answers to questions that cannot be answered.
That's the standard Piagetian understanding of child development, yes. Humans do not start out with theory of mind, and are thus inherently solipsistic, but in most cases an understanding that there are other conscious beings with their own thoughts, goals and feelings develops between the ages of 2 and 7.
Developing theory of mind is one of the key milestones in child development.
> Solipsism is essentially the one certainty that every mind out there has.
Not I. I'm just a Boltzmann brain.
I’m not sure what sentience has to do with it.
I mean, conversationally, of course we work a little more like that (I tend to think in whole sentence blocks before I say them but I suppose they assemble themselves largely word-by-word, or word-by-word with a bit of editing).
But right now I am trying to design something, a physical mechanism with a particular enclosure, that I cannot clearly describe (this makes it hard to research). I designed a previous version without even knowing the words that do, in fact, describe that.
I have a theory about it, animated in my mind, that I can only test by making it.
If I want you to know about it, I can either show you it or work out words to describe it, which will be inadequate to describing it.
The idea for it came from seeing things nobody has ever put into words for me.
"Next-word sayer" doesn't describe any of this process, does it?
(This is also why text-to-CAD is a bullshit idea)
This is wrong. Human thinking and speech aren't autoregressive like LLM inference.
While the how is different, the what has many parallels. E.g. both the brain and LLMs appear to learn distributions of representations, they both develop a hierarchy of those representations, both have early layers that process simple features, with later ones processing more abstract concepts, both predict missing information...
The post I responded to stated that the commenter was just a next-word-sayer, but that's wrong. The similarities you draw aren't really relevant to my reply.
No disrespect intended, but I think my response is relevant, because the broader topic here is whether LLMs and the human mind share similar functions. They do in fact have a lot of overlapping features, and a fundamental one is predicting the next thing, be that a word, image, or otherwise.
It's not relevant. However, if you want to talk about a broader point, that's ok.
> LLMs appear to learn distributions of representations, they both develop a hierarchy of those representations, both have early layers that process simple features, with later ones processing more abstract concepts, both predict missing information.
This type of superficial comparison isn't very meaningful; it's trivial to liken anything to human biology in this manner.
A plane and a bird both use wings to produce lift, it doesn't then follow that a bird and a plane are meaningfully similar.
> A plane and a bird both use wings to produce lift, it doesn't then follow that a bird and a plane are meaningfully similar.
The use of Bernoulli's principle to achieve lift is a fundamental and meaningfully similar function of both airplane and bird wings. That functional similarity is well known.
> This type of superficial comparison isn't very meaningful
The comparisons I provided are fundamental to both the human mind and LLMs. That's pretty darn relevant, and whether you find that trivial or not is a matter of opinion.
Do you not say your words one-at-a-time like everyone else? Otherwise I can’t see how my comment is “wrong”
Even if you could understand human cognition to the level required to say, confidently, that it’s done one word at a time, it’s likely not! Natural language is not a prerequisite for human intelligence, as evidenced by the fact that we went from primates to commenting on HN.
Natural language is, however, a prerequisite for the existence of LLMs. It’s more similar to methods for storing and retrieving information, like the printing press or a database, than it is to a sentient being.
That’s not to say that LLMs can’t do crazy things, because they already have. Our language can encode a whole lot of information, and it’s incredible that we’ve found a way to distill that so effectively.
> Even if you could understand human cognition to the level required to say, confidently, that it’s done one word at a time, it’s likely not!
I think they’re not talking about cognition, but about output: regardless of what may be happening inside your brain, ultimately one word at a time comes out of your mouth, right? And you can’t then unsay it.
When you put it in those terms, LLMs are in exactly the same boat.
Didn't DeepSeek Zero mix all languages together into something very efficient?
Interesting thought but I assume a lot of samples in the training corpus are examples of translation between languages and the same text in different languages.
Only one word at a time!?! It's time you embrace the way of the diffusion model and hazily refine your entire thought until it's coherent.
> Do you not say your words one-at-a-time like everyone else
You're conflating being autoregressive with being sequential.
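A small sketch of the difference, with made-up function names. Both generators emit one word at a time, but only the second conditions each word on what it has already emitted.

```python
def sequential_speaker(plan):
    """Emits an already formed plan one word at a time; outputs are never fed back."""
    for word in plan:
        yield word


def autoregressive_speaker(next_word, prompt, steps):
    """Chooses each word from everything emitted so far, then feeds it back in."""
    emitted = list(prompt)
    for _ in range(steps):
        word = next_word(emitted)
        emitted.append(word)
        yield word
```

From the outside the two look the same (a stream of words in order), which is why "you say words one at a time" doesn't by itself establish that speech is autoregressive.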
Calling LLMs 'next token predictors' is completely reductive and disingenuous; it's true that technically that is what they're doing, but so are you! What people generally mean by this though is that they're just 'predicting the next token of their training [i.e. the internet]'. If you were talking about the raw models, this would actually be true; but the models are post trained, so even this description isn't true at all anymore! Saying they aren't 'intelligent' is both not useful and (imo) wrong. Who cares if it matches your definition of 'intelligent'; it still gets impressive stuff done, much more impressive stuff than you seem to be implying.
What would you say is your benchmark for calling something intelligent?
Can it solve problems?