Any serious LLM user will tell you that there's no way to get from LLMs to AGI.

These models are vast and, in many ways, clearly superhuman. But they can't venture outside their training data, not even if you hold their hand and guide them.

Try getting Suno to write a song in a new genre. Even if you tell it EXACTLY what you want, and provide it with clear examples, it won't be able to do it.

This is also why there have been zero-to-very-few new scientific discoveries made by LLMs.

Most humans aren't making new scientific discoveries either, are they? Does that mean they don't have AGI?

Intelligence is mostly about pattern recognition. All those model weights represent patterns, compressed and encoded. If you can find a similar pattern in a new place, perhaps you can make a new discovery.

One problem is that the patterns are static. Sooner or later, someone is going to figure out a way to give LLMs "real" memory. I'm not talking about keeping a long-term context, extending it with markdown files, RAG, etc., like we do today for an individual user, but about updating the underlying model weights incrementally, basically resulting in a learning, collective memory.
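A minimal sketch of that distinction, assuming PyTorch and Hugging Face transformers as stand-ins; the model name, learning rate, and single-example update here are illustrative choices, not a real continual-learning recipe (which would need replay or regularization to avoid catastrophic forgetting):

```python
# Sketch only: "remembering" by nudging the weights themselves,
# rather than stuffing new facts into a prompt or a RAG store.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; any causal LM works the same way
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def absorb(text: str) -> float:
    """Take one small gradient step on new text, so the update
    persists in the shared weights instead of in a context window."""
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model(**inputs, labels=inputs["input_ids"])
    optimizer.zero_grad()
    outputs.loss.backward()
    optimizer.step()
    return outputs.loss.item()

# Every interaction could feed back into the collective memory:
absorb("Something the model learned from today's conversations.")
```

Whether naive updates like this can ever become a stable collective memory is an open question; catastrophic forgetting is the standard obstacle.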

Can most people venture outside their training data?

Are you seriously comparing chips running AI models and human brains now???

Last time I checked, the chips are not rewiring themselves the way the brain does, nor does even the software rewrite itself or the model recalibrate itself - nothing that could be called "learning", which is normal daily work for a human brain.

Also, the models are not models of the world, but of our text communication only.

Human brains start by building a model of the physical world, from age zero. Much later, on top of that foundation, more abstract ideas emerge, including language. Text, even later. And all of it on a deep layer of a physical world model.

The LLM has none of that! It has zero depth behind the words it learned. It's like a human learning some strange symbols and the rules governing their appearance. The human will be able to reproduce valid chains of symbols following the learned rules, but they will never have any understanding of those symbols. In the human case, somebody would have to connect those symbols to their world model by explaining the "meaning" in terms they can already use. For the LLM that is not possible, since it doesn't have such a model to begin with.

How anyone can even entertain the idea of "AGI" based on uncomprehending symbol manipulation, where every symbol has zero depth of a physical world model, only connections to other symbols, is beyond me TBH.

In some ways no, because to learn something you have to LEARN it, and then it's in the training data. But humans can do it continuously, sometimes randomly, and without being prompted.

If you're a scientist -- and in many cases if you're an engineer, or a philosopher, or even perhaps a theologian -- your job is quite literally to add to humanity's training data.

I'd add that fiction is much more complicated. LLMs can clearly write original fiction, even if they are, as yet, not very good at it. There's an idea (often attributed to John Gardner or Leo Tolstoy) that all stories boil down to one of two scenarios:

> "A stranger comes to town."

> "A person goes on a journey."

Christopher Booker wrote that there are seven: https://en.wikipedia.org/wiki/The_Seven_Basic_Plots

So I'd tentatively expect tomorrow's LLMs to write good fiction along those well-trodden paths. I'm less sanguine about their applications in scientific invention and in producing original music.

Yes, they can.

Ever heard of creativity?

I mean yeah, but that's why there are far more research avenues these days than just pure LLMs, for instance world models. The thinking is that if LLMs can achieve near-human performance in the language domain, then we must be very close to achieving human performance in the "general" domain - that's the main thesis of the current AI financial bubble (see articles like AI 2027). And if that is the case, you still want as much compute as possible, both to accelerate research and to achieve greater performance on other architectures that benefit from scaling.

How does scaling compute not go hand in hand with energy generation? To me, scaling one and not the other puts a different set of constraints on overall growth. And the energy industry works at a different pace than the hyperscalers scaling compute.

The other thing here is that we know the human brain learns from far fewer samples than LLMs do in their current form. If there is any kind of learning breakthrough, the amount of compute used for learning could explode overnight.