Hacker News

svara 6 days ago [ - ]

Insightful and thanks for the comment, but I'm not sure I'm getting to the same conclusion as you. I think I lost you at:

> It's not just that all models are wrong and some are useful but that many models are useful but wrong. What used to be considered edge cases do not stay ...

That's not a contradiction? That popular quote says it right there: "all models are wrong". There is no model of reality, but there's a process for refining models that generates models that enable increasingly good predictions.

It stands to reason that an ideal next-token predictor would require an internal model of the world at last equally as powerful as our currently most powerful scientific theories. It also stands to reason that this model can, in principle, be trained from raw observational data, because that's how we did it.

And conversely, it stands to reason that a next-token predictor as powerful as the current crop of LLMs contains models of the world substantially more powerful than the models that powered what we used to call autocorrect.

Do you disagree with that?

godelski 6 days ago [ - ]

  > That's not a contradiction?

Correct. No contradiction was intended. As you quote, I wrote "It's not just that". This is not setting up a contrasting point, this is setting up a point that follows. Which, as you point out, does follow. So let me rephrase

  > If all models are wrong but some are useful then this similarly means that all useful models are wrong in some way.

Why flip it around? To highlight the part where they are incorrect as this is what is the thesis of my argument.

With that part I do not disagree.

  > It stands to reason that an ideal next-token predictor would require an internal model of the world at last equally as powerful as our currently most powerful scientific theories.

With this part do not agree. There's not only the strong evidence I previously mentioned that demonstrates this happening in history, but we can even see the LLMs doing it today. We can see them become very good predictors yet the world that they model for is significantly different from the one we live in. Here's two papers studying exactly that![0,1]

To help make this clear, we really need to understand that you can't have a "perfect" next-token predictor (or any model). To "perfectly" generate the next token would require infinite time, energy, and information. You can look at this through the point of view as the Bekenstein bound[2], the Data Processing Inequality theorem[3], or even the No Free Lunch Theorem[4]. While I say you can't make a "perfect" predictor, that doesn't mean you can't get 100% accuracy on some test set. That is a localization, but as those papers show, one doesn't need to have an accurate world model to get such high accuracies. And as history shows, we don't only make similar mistakes but (this is not a contradiction, rather it follows the previous statement) we are resistant to updating our model. And for good reason! Because it is hard to differentiate models which make accurate predictions.

I don't think you realize you're making some jumps in logic. Which I totally understand, they are subtle. But I think you will find them if you get really nitpicky with your argument making sure that one thing follows from another. Make sure to define everything: e.g. next-token predictor, a prediction, internal model, powerful, and most importantly how we did it.

Here's where your logic fails:

You are making the assumption that given some epsilon bound on accuracy, that there will only be one model which accurate to that bound. Or, in other words, there is only one model that makes perfect predictions so by decreasing model error we must converge to that model.

The problem with this is that there are an infinite number of models that make accurate predictions. As a trivial example, I'm going to redefine all addition operations. Instead of doing "a + b" we will now do "2 + a + b - 2". The operation is useless, but it will make accurate calculations for any a and b. There are much more convoluted ways to do this where it is non-obvious that this is happening.

When we get into the epsilon-bound issue, we have another issue. Let's assume the LLM makes as accurate predictions as humans. You have no guarantee that they fail in the same way. Actually, it would be preferable if the LLMs fail in a different way than humans, as the combined efforts would then allow for a reduction of error that neither of us could achieve.

And remember, I only made the claim that you can't prove something correct simply through testing. That is, empirical evidence. Bekenstein's Bound says just as much. I didn't say you can't prove something correct. Don't ignore the condition, it is incredibly important. You made the assumption that we "did it" through "raw observational data" alone. We did not. It was an insufficient condition for us, and that's my entire point.

[0] https://arxiv.org/abs/2507.06952

[1] https://arxiv.org/abs/2406.03689

[2] https://en.wikipedia.org/wiki/Bekenstein_bound

[3] https://en.wikipedia.org/wiki/Data_processing_inequality

[4] https://en.wikipedia.org/wiki/No_free_lunch_theorem

svara 6 days ago [ - ]

If I take what you just wrote together with the comment I first reacted to, I believe I understand what you're saying as the following: Of a large or infinite number of models, which in limited testing have equal properties, only a small subset will contain actual understanding, a property that is independent of the model's input-output behavior?

If that's indeed what you mean, I don't think I can agree. In your 2+a+b-2 example, that is an unnecessarily convoluted, but entirely correct model of addition.

Epicycles are a correct model of celestial mechanics, in the limited sense of being useful for specific purposes.

The reason we call that model wrong is that it has been made redundant by a different model that is strictly superior - in the predictions it makes, but also in the efficiency of its teaching.

Another way to look at it is that understanding is not a property of a model, but a human emotion that occurs when a person discovers or applies a highly compressed representation of complex phenomena.

godelski 6 days ago [ - ]

  > only a small subset will contain actual understanding, a property that is independent of the model's input-output behavior?

I think this is close enough. I'd say "a model's ability to make accurate predictions is not necessarily related to the model's ability to generate counterfactual predictions."

I'm saying, you can make extremely accurate predictions with an incorrect world model. This isn't conjecture either, this is something we're extremely confident about in science.

  > I don't think I can agree. In your 2+a+b-2 example, that is an unnecessarily convoluted, but entirely correct model of addition.

I gave it as a trivial example, not as a complete one (as stated). So be careful with extrapolating limitations of the example with limitations of the argument. For a more complex example I highly suggest looking at the actual history around the heliocentric vs geocentric debate. You'll have to make an active effort to understand this because what you were taught in school is very likely an (very reasonable) over simplification. Would you like a much more complex mathematical example? It'll take a little to construct and it'll be a lot harder to understand. As a simple example you can always take a Taylor expansion of something so you can approximate it, but if you want an example that is wrong and not through approximation then I'll need some time (and a specific ask).

Here's a pretty famous example with Freeman Dyson recounting an experience with Fermi[0]. Dyson's model made accurate predictions. Fermi is able to accurately dismiss Dyson's idea quickly despite strong numerical agreement between the model and the data. It took years to determine that despite accurate predictions it was not an accurate world model.

*These situations are commonplace in science.* Which is why you need more than experimental agreement. Btw, experiments are more informative than observations. You can intervene in experiments, you can't in observations. This is a critical aspect to discovering counterfactuals.

If you want to understand this deeper I suggest picking up any book that teaches causal statistics or any book on the subject of metaphysics. A causal statistics book will teach you this as you learn about confounding variables and structural equation modeling. For metaphysics Ian Hacking's "Representing and Intervening" is a good pick, as well as Polya's famous "How To Solve It" (though it is metamathematics).

[0] (Mind you, Dyson says "went with the math instead of the physics" but what he's actually talking about is an aspect of metamathematics. That's what Fermi was teaching Dyson) https://www.youtube.com/watch?v=hV41QEKiMlM

svara 5 days ago [ - ]

Side note, it's not super helpful to tell me what I need to study in broad terms without telling me about the specific results that your argument rests on. They may or may not require deep study, but you don't know what my background is and I don't have the time to go read a textbook just because someone here tells me that if I do, I'll understand how my thinking is wrong.

That said, I really do appreciate this exchange and it has helped me clarify some ideas, and I appreciate the time it must take you to write this out. And yes, I'll happily put things on my reading list if that's the best way to learn them.

Let me offer another example that I believe captures more clearly the essence of what you're saying: A model that learns addition from everyday examples might come up with an infinite number of models like mod(a+b, N), as long as N is extremely large.

(Another side note, I think it's likely that something like this does in fact happen in currently SOTA AI.)

And, the fact that human physicists will be quick to dismiss such a model is not because it fails on data, but because it fails a heuristic of elegance or maybe naturalness.

But, those heuristics in turn are learnt from data, from the experience of successful and failing experiments aggregated over time in the overall culture of physics.

You make a distinction between experiment and observation - if this was a fundamental distinction, I would need to agree with your point, but I don't see how it's fundamental.

An experiment is part of the activity of a meta-model, a model that is trained to create successful world models, where success is narrowly defined as making accurate physical predictions.

This implies that the meta-model itself is ultimately trained on physical predictions, even if its internal heuristics are not directly physical and do not obviously follow from observational data.

In the Fermi anecdote that you offer, Fermi was talking from that meta-model perspective - what he said has deep roots in the culture of physics, but what it really is is a successful heuristic; experimental data that disagree with an elegant model would still immediately disprove the model.

godelski 5 days ago [ - ]

  > without telling me about the specific results that your argument rests on

We've been discussing it the whole time. You even repeated it in the last comment.

  A model that is accurate does not need to be causal

By causal I mean that the elements involved are directly related. We've seen several examples. The most complex one I've mentioned is the geocentric model. People made very accurate predictions with their model despite their model being wrong. I also linked two papers on the topic giving explicit examples where a LLM's world model was extracted and found to be inaccurate (and actually impossible) despite extremely high accuracy.

If you're asking where in the books to find these results, pick up Hacking's book, he gets into it right from the get go.

  > is not because it fails on data, but because it fails a heuristic of elegance or maybe naturalness.

With your example it is very easy to create examples where it fails on data.

A physicist isn't rejecting the model because of lack of "naturalness" or "elegance", they are rejecting it because it is incorrect.

  > You make a distinction between experiment and observation

Correct. Because while an observation is part of an experiment an experiment has much more than an observation. Here's a page that goes through interventional statistics (and then moves into counterfactuals)[0]. Notice that to do this you can't just be an observer. You can't just watch (what people often call "natural experiments"), you have to be an active participant. There's a lot of different types of experiments though.

  > This implies that the meta-model itself is ultimately trained on physical predictions

While yes, physical predictions are part of how humans created physics, it wasn't the only part.

That's the whole thing here. THERE'S MORE. I'm not saying "you don't need observation" I'm saying "you need more than observations". Don't confuse this. Just because you got one part right doesn't mean all of it is right.

[0] https://www.inference.vc/causal-inference-2-illustrating-int...