This is a good line: "It found that smarter entities are subjectively judged to behave less coherently"
I think this is twofold:
1. Advanced intelligence requires the ability to traverse between domain valleys in the cognitive manifold. Be it via temperature or some fancy tunneling technique, it's going to incur higher error (appear less coherent) in the valleys of the manifold than naive gradient following to the local minima would (see the toy sketch after point 2).
2. It's hard to "punch up" when evaluating intelligence. When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.
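To make point 1 a bit more concrete, here's a toy sketch (a made-up 1D loss with two basins; the numbers and functions are purely illustrative, nobody's actual model): naive gradient descent stays in whichever basin it starts in, while a Metropolis-style walker with nonzero temperature occasionally accepts uphill moves, so it can cross the ridge between basins at the cost of temporarily higher loss.

    import math
    import random

    def loss(x):
        # Two basins, with minima near x = -1 and x = 2, separated by a ridge.
        return (x + 1) ** 2 * (x - 2) ** 2

    def gradient_descent(x, lr=0.01, steps=2000):
        # Naive gradient following: converges to whichever local minimum it starts near.
        for _ in range(steps):
            grad = (loss(x + 1e-5) - loss(x - 1e-5)) / 2e-5  # numerical gradient
            x -= lr * grad
        return x

    def tempered_walk(x, temperature=1.0, step=0.3, steps=2000):
        # Always accepts downhill moves, sometimes uphill ones, so it can cross
        # the ridge between basins (paying higher loss while in between).
        for _ in range(steps):
            candidate = x + random.gauss(0, step)
            delta = loss(candidate) - loss(x)
            if delta < 0 or random.random() < math.exp(-delta / temperature):
                x = candidate
        return x

    print(gradient_descent(-1.5))  # ends up near -1 every time
    print(tempered_walk(-1.5))     # can end up near -1 or near 2, run to run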
Incoherence is not error.
You can have vanishingly small error and maximal incoherence at the same time.
That would be evidence of perfect alignment (zero bias) and very low variance.
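For anyone without the error/bias/variance vocabulary handy, a minimal numerical sketch (toy numbers of my own, not from the article): mean squared error decomposes as bias squared plus variance, so vanishingly small error forces both to be small; incoherence, on this framing, is a separate axis from either.

    import numpy as np

    rng = np.random.default_rng(1)
    truth = 3.0
    # An estimator that is unbiased and has tiny variance: its error is vanishingly small.
    estimates = truth + 0.01 * rng.normal(size=10_000)

    bias = estimates.mean() - truth
    variance = estimates.var()
    mse = ((estimates - truth) ** 2).mean()
    print(bias, variance, mse)  # mse is approximately bias**2 + variance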
> the ability to traverse between domain valleys in the cognitive manifold.
Couldn't you have just said "know about a lot of different fields"? Was your comment sarcastic or do you actually talk like that?
I think they mean both "know about a lot of different fields" and also "be able to connect them together to draw inferences", the latter perhaps being tricky?
Maybe? They should speak more clearly regardless, so we don't have to speculate over it. The way you worded it is much more understandable.
There wasn't much room to speculate really, but it does require some familiarity with problem spaces, topology, and things like minima and maxima.
"inaccessible" rather than "ambiguous" -- but to the uninitiated they are hard to tell apart.
What do 'domain valleys' and 'tunneling' mean in this context?
So, the hidden mental model that the OP is expressing and failed to elucidate is that LLMs can be thought of as compressing related concepts into approximately orthogonal subspaces of the vector space that is upper bounded by the superposition of all of their weights. Since training has the effect of compressing knowledge into subspaces, a necessary corollary is that there are now regions within the vector space that contain very little. Those are the valleys that need to be tunneled through, i.e. the model needs to activate disparate regions of its knowledge manifold simultaneously, which seems like it might be difficult to do. I’m not sure if this is a good way of looking at things though, because inference isn’t topology and I’m not sure that abstract reasoning can be reduced down to finding ways to connect concepts that have been learned in isolation.
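Not sure this is the OP's mental model either, but here is a rough toy construction of the "approximately orthogonal subspaces" picture (dimensions, noise levels, and the two "domains" are all made up): points from each domain cluster around nearly orthogonal directions, and the region between the clusters is sparsely occupied.

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 256

    # Two nearly orthogonal "domain" directions.
    domain_a = rng.normal(size=dim)
    domain_a /= np.linalg.norm(domain_a)
    domain_b = rng.normal(size=dim)
    domain_b -= (domain_b @ domain_a) * domain_a  # project out the domain_a component
    domain_b /= np.linalg.norm(domain_b)

    def sample(direction, n=200, noise=0.02):
        # Points clustered tightly around a domain direction, renormalized to the sphere.
        pts = direction + noise * rng.normal(size=(n, dim))
        return pts / np.linalg.norm(pts, axis=1, keepdims=True)

    a = sample(domain_a)
    b = sample(domain_b)

    within = float((a @ a.T).mean())   # high: same-domain points sit close together
    across = float((a @ b.T).mean())   # near zero: the domains are roughly orthogonal,
    print(within, across)              # and the region "between" them is mostly empty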
Not the OP, but my interpretation here is that if you model the replies as some point in a vector space, assuming points from a given domain cluster close to each other, replies that span two domains need to "tunnel" between these two spaces.
A hallmark of intelligence is the ability to find connections between the seemingly disparate.
That's also a hallmark of some mental/psychological illnesses (paranoid schizophrenia family) and use of certain drugs, particularly hallucinogens.
The hallmark of intelligence in this scenario is not just being able to make the connections, but being able to pick the right ones.
The word "seemingly" is doing a lot of work here.
Sometimes things that look very different actually are represented with similar vectors in latent space.
When that happens to us it "feels like" intuition: something you can't really put a finger on, and it might require work to put it into a form that can be transferred to another human who has a different mental model.
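A hedged sketch of the "similar vectors in latent space" point (the model name and the example sentences are my own picks, not from the thread): embed two superficially unrelated statements with an off-the-shelf sentence encoder and look at how close they land.

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder would do here

    pair = [
        "Heat flows from hot objects to cold ones until their temperatures equalize.",
        "Prices drift until supply and demand balance out.",
    ]
    a, b = model.encode(pair, normalize_embeddings=True)
    # Cosine similarity of the two embeddings; the claim above is that this can be
    # higher than the surface wording of the two sentences would suggest.
    print(float(a @ b))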
Actually, a hallmark could be to prune illusory connections, right? That would decrease complexity rather than amplifying it.
Yes, that also happens, for example when someone first said natural disasters are not triggered by offending gods. It is all about making explanations as simple as possible but no simpler.
Does this make conspiracy theorists highly intelligent?
No, but they emulate intelligence by making up connections between seemingly disparate things, where there are none.
They make connections but lack the critical thinking skills to weed out the bad/wrong ones.
Which is why, just occasionally, they're right, but mostly by accident.
> When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.
Insights are “deep” not on their own merit, but because they reveal something profound about reality. Such a revelation is either testable or not. If it’s testable, distinguishing it from bullshit is relatively easy, and if it’s not testable even in principle, a good heuristic is to put it in the bullshit category by default.
This was not my experience studying philosophy. After Kant there was a period where philosophers were basically engaged in a centuries-long obfuscated-writing competition. The pendulum didn't start to swing back until Nietzsche. It reminded me of legal jargon but more pretentious and less concrete.
It seems to me that your anecdote exemplifies their point.
The issue is the revelation. It's always individual at some level. And don't forget our senses are crude. The best way is to store "insights" as information until we collect enough data that we can test them (hopefully without a lot of bias). But that can be more than a lifetime's work, so sometimes you have to take some insights at face value based on heuristics (parents, teachers, elders, authority, ...)