You can’t really say it is just predicting continuations when it is learning to write proofs for Erdős problems, formalise significant math results, or perform automated AI research. Those are far beyond what you get by just being a copying and re-forming machine; a lot of these problems require sophisticated application of logic.

I don’t know if this can reach AGI, or if that term makes any sense to begin with. But to say these models have not learnt from their RL seems a bit ludicrous. What do you think training to predict when to use different continuations is other than learning?

I would say LLMs’ failure cases, like failing at riddles, are more akin to our own optical illusions and blind spots than indicative of the nature of LLMs as a whole.

I think you're conflating mechanism with function/capability.

I'm not sure what I wrote that made you conclude that I thought these models are not learning anything from their RL training?! Let me say it again: they are learning to steer towards reasoning steps that during training led to rewards.

The capabilities of LLMs, both with and without RL, are a bit counter-intuitive, and I think that, at least in part, comes down to the massive size of the training sets and the even more massive number of novel combinations of learnt patterns they can therefore potentially generate...

In a way it's surprising how FEW new mathematical results they've been coaxed into generating, given that they've probably encountered a huge portion of mankind's mathematical knowledge, and can potentially recombine all of these pieces in at least somewhat arbitrary ways. You might have thought that there are results A, B and C hiding away in some obscure mathematical papers that no human has previously thought to put together (just because of the vast number of such potential combinations), and that might lead to some interesting result.

If you are unsure yourself about whether LLMs are sufficient to reach AGI (meaning full human-level intelligence), then why not listen to someone like Demis Hassabis, one of the brightest and best-placed people in the field to have considered this, who says the answer is "no", and that a number of major new "transformer-level" discoveries/inventions will be needed to get there?

> What do you think training to predict when to use different continuations is other than learning?

Sure, training = learning, but the problem with LLMs is that this is where it stops, other than a limited amount of ephemeral in-context learning/extrapolation.

With an LLM, learning stops post-training when it is "born" and deployed, while with an animal that's when it starts! The intelligence of an animal is a direct result of its lifelong learning, whether that's imitation learning from parents and peers (and subsequent experimentation to refine the observed skill), or the never-ending process of observation/prediction/surprise/exploration/discovery which is what allows humans to be truly creative - not just behaving in ways that are endless mashups of things they have seen and read about other humans doing (cf. the training set), but generating truly novel behaviors (such as creating scientific theories) based on their own directed exploration of gaps in mankind's knowledge.

Application of AGI to science and new discovery is a large part of why Hassabis defines AGI as human-equivalent intelligence, and understands what is missing, while others like Sam Altman are content to define AGI as "whatever makes us lots of money".
