Hacker News

ACCount37 a day ago

"Next token prediction" is an interface, not an algorithm. A process that "predicts next tokens" can be arbitrarily complex or simple, and arbitrarily capable or incapable of performing a given task.

Saying that an LLM can or can't do something because it's a "token predictor" is a category error. The interface isn't a hard limit.
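
To make the interface/algorithm distinction concrete, here is a minimal sketch. Everything in it (`NextTokenPredictor`, `RepeatLast`, `Adder`, `generate`) is a hypothetical name made up for illustration, not anything from a real LLM library. Both classes satisfy the same "give me the next token" interface, yet one just echoes its input while the other runs an actual computation.

```python
from typing import List, Protocol, Sequence


class NextTokenPredictor(Protocol):
    """The interface: given the tokens so far, return one more token."""

    def next_token(self, context: Sequence[str]) -> str: ...


class RepeatLast:
    """Trivial predictor: echoes the most recent token."""

    def next_token(self, context: Sequence[str]) -> str:
        return context[-1] if context else "<eos>"


class Adder:
    """Predictor that performs integer addition behind the same interface."""

    def next_token(self, context: Sequence[str]) -> str:
        # Assumes contexts shaped like ["12", "+", "30", "="].
        if len(context) == 4 and context[1] == "+" and context[3] == "=":
            return str(int(context[0]) + int(context[2]))
        return "<eos>"


def generate(model: NextTokenPredictor, prompt: Sequence[str],
             max_new_tokens: int = 1) -> List[str]:
    """Generic decoding loop; it knows nothing about what the model does."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = model.next_token(tokens)
        if tok == "<eos>":
            break
        tokens.append(tok)
    return tokens


prompt = ["12", "+", "30", "="]
print(generate(RepeatLast(), prompt))  # last token is "="
print(generate(Adder(), prompt))       # last token is "42"
```

The `generate` loop is identical in both cases; what the system can or can't do is decided entirely by what sits behind `next_token`.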

IsTom 13 hours ago

I'm not sure if it has any real bearing on real-world performance, but technically next-token prediction makes it an online algorithm, and online algorithms can be provably worse than (good) offline algorithms.
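
For anyone unfamiliar with the online/offline gap, the textbook illustration is the ski rental problem (this says nothing about LLMs specifically; it is only the standard example of the phenomenon). Renting costs 1 per day, buying costs `buy_price`, and the online player doesn't know how many days the season lasts. A rough sketch, with function names of my own choosing:

```python
def offline_cost(days: int, buy_price: int) -> int:
    """Optimal cost when the season length is known in advance."""
    return min(days, buy_price)


def online_break_even_cost(days: int, buy_price: int) -> int:
    """Break-even strategy: rent for buy_price - 1 days, then buy.

    The decision on each day uses only the days seen so far.
    """
    if days < buy_price:
        return days  # season ended before we ever bought
    return (buy_price - 1) + buy_price


buy_price = 10
for days in (3, 10, 100):
    print(days, offline_cost(days, buy_price),
          online_break_even_cost(days, buy_price))
```

With `buy_price = 10` and a season of 10 or more days, the online strategy pays 19 where the offline optimum pays 10. No deterministic online strategy can guarantee better than roughly a factor of 2 here, which is the sense in which "provably worse" is meant.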

dncornholio 12 hours ago

The word "prediction" still holds a lot of weight. LLMs can only predict what has been written. This is a hard limit.

ACCount37 10 hours ago

For something like "a hard limit" to hold, LLMs must be restricted to only reproducing existing text. This is utterly false even for base models - their basin seems to be "permutations loosely inspired by existing text".

And that's before all the post-training comes in.

What's the "limit" there?