LLMs alone aren't the way to AGI. Perhaps something that merges in diffusion or other models grounded in more sensory elements, like images, time, and motion, but LLMs alone aren't going to get us there.

The end of the exponential means the start of other models.

> LLMs alone aren't the way to AGI

Pretraining + RL works, and there is no clear evidence that it doesn't scale further.

Pretraining + RL is itself the scaling limit. If you feed it the entire dataset available before 1905, an LLM isn't going to come up with general relativity. It has no concept of physics, or even of time.

AGI happens when you DON'T need to scale pretraining + RL.

> If you feed it the entire dataset before 1905, LLMs aren't going to come up with general relativity.

Link?

You don't need a source for that; an LLM trained on so little data is barely able to form proper sentences.

> an LLM trained on so little data

There is a mountain of data pre-1905. Certainly enough to train a decent 30B parameter model.

Now, digitizing & OCRing all of that data... THAT is a challenge.
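The "enough to train a decent 30B parameter model" claim can be sanity-checked with the Chinchilla-style heuristic of roughly 20 training tokens per parameter; the bytes-per-token figure below is a rough assumption for English text, not a measurement:

```python
# Back-of-envelope: how much raw text would a compute-optimal 30B model want?
# Assumes the Chinchilla heuristic (~20 tokens per parameter) and
# ~4 bytes of raw text per token; both are rough assumptions.

PARAMS = 30e9            # 30B-parameter model
TOKENS_PER_PARAM = 20    # Chinchilla-optimal heuristic
BYTES_PER_TOKEN = 4      # rough average for English prose

tokens_needed = PARAMS * TOKENS_PER_PARAM
raw_text_tb = tokens_needed * BYTES_PER_TOKEN / 1e12

print(f"Tokens needed: {tokens_needed:.1e}")   # ~6.0e11 tokens
print(f"Raw text: ~{raw_text_tb:.1f} TB")      # ~2.4 TB
```

So the question becomes whether the pre-1905 written record actually contains on the order of a few terabytes of distinct, digitizable text, which is exactly where the OCR challenge above bites.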

AGI, maybe not, but it is reaching disruption-level intelligence in the SWE domain.