This sentence is the exact reason laying people off and replacing them with AI doesn’t work.

The fact that machine learning can learn highly detailed patterns is the very reason why AI is so useful. So what you’re saying doesn’t really make much sense.

Right, but the 'surprising level of detail' can often show up as exactly not a pattern. There are many jobs where you employ a human not for the rote, pattern-based work but for their ability to handle all the edge cases that are just frequent enough to need them, but not frequent enough for AI to be able to handle. That is, the events that in this example would require the AI to ask the human to make some decision for it.

> The fact that machine learning can learn highly detailed patterns is the very reason why AI is so useful.

AI doesn't deal with reality, it deals with tokens. This is why all those vibe-coded harnesses, little more than glue between various text IO interfaces, are several hundreds of thousands of source lines of code.

It's why a SOTA model took 100k SLoC to write a C compiler to compile one specific project.

It's why, when I asked for a simple markdown -> ANSI escape codes converter (for terminal output) in Python, SOTA Claude and SOTA ChatGPT both gave me roughly 150 SLoC, when my own LUT-based version came to under 10 lines of code + a LUT.
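A minimal sketch of what a LUT-based converter along those lines could look like. This is not the code referred to above: the `ANSI` table, the `md_to_ansi` name, and the choice of supported inline markers are all assumptions for illustration, and it handles only simple, non-nested inline markup (no headings, lists, or links).

```python
import re

# Lookup table: markdown delimiter -> (ANSI SGR "on" code, SGR "off" code).
# The set of markers covered here is an assumption of this sketch.
ANSI = {
    "**": ("\033[1m", "\033[22m"),  # bold
    "~~": ("\033[9m", "\033[29m"),  # strikethrough
    "*": ("\033[3m", "\033[23m"),   # italic
    "`": ("\033[7m", "\033[27m"),   # inline code, shown as reverse video
}

# Try longer delimiters first so "**" is matched before "*".
_DELIMS = "|".join(re.escape(d) for d in sorted(ANSI, key=len, reverse=True))
_PATTERN = re.compile(rf"({_DELIMS})(.+?)\1")


def md_to_ansi(text: str) -> str:
    """Replace paired inline markdown delimiters with ANSI escape codes."""
    return _PATTERN.sub(
        lambda m: ANSI[m.group(1)][0] + m.group(2) + ANSI[m.group(1)][1],
        text,
    )
```

Calling `print(md_to_ansi("some **bold** and *italic* text"))` in a terminal that supports SGR codes should render the marked spans with the corresponding styles; supporting another marker is a one-line addition to the table.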

Reality has a surprising amount of detail, but LLMs don't exist in reality, they exist in a virtual world made up of tokens.

The discretization of those tokens can be manipulated to get any result you want. If it meaningfully benefits the AI to have a more fine-grained discretization, then you can do that. AI only compresses as much as we want it to. I understand your sentiment, but the logical conclusion of what you’re saying is that no form of compression is ever valuable. That’s just not a defensible argument.

All information gets compressed. Even your own perception of reality gets compressed.

Do you exist in reality? Or just in a virtual world made up of sensory signals? Do you have access to the Ding an sich any more than a (multimodal) LLM?

> Do you exist in reality?

Yes.

> Or just in a virtual world made up of sensory signals?

No, definitely reality. Things affect my thought whether I sense them or not.

How would you know? You have no external frame of reference; a virtual world of sensory signals would be identical from your perspective. (I agree that "reality" is the most parsimonious explanation by far, btw, but that's never been the point of the simulation thought experiment.)

I think the more interesting corollary of this article is that if we're living in a simulation, it's an impossibly, improbably detailed one. I really want some compute time on the HPC that's running it.

> How would you know? You have no external frame of reference; a virtual world of sensory signals would be identical from your perspective.

Okay, let's go with that :-)

I might be living in a virtual reality, correct, I have no way of knowing.

What I do know is that the reality I am in is many thousands of times higher in resolution than the reality of the LLM.

As an analogy, the LLM is seeing a downscaled 32x32 pixel image while I see the original 8k image. Whether there is a larger 1b^2 image that I cannot see is not relevant to the question of whether the LLM can see my reality or not - it can't.
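To put rough numbers on the analogy, here is a small sketch. It assumes "8k" means 8K UHD (7680x4320), which the comment does not specify, and uses block averaging as one possible way to downscale; the `downscale` helper and the toy grids are made up for illustration. It shows that two different fine-grained inputs can collapse to the same coarse output, so the coarse view cannot tell them apart.

```python
def downscale(img, factor):
    """Block-average a square 2D list of numbers by an integer factor."""
    n = len(img) // factor
    out = []
    for by in range(n):
        row = []
        for bx in range(n):
            # Collect every value in the factor x factor block.
            block = [
                img[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Two different 4x4 "originals": a checkerboard and a flat grey field.
checker = [
    [0, 2, 0, 2],
    [2, 0, 2, 0],
    [0, 2, 0, 2],
    [2, 0, 2, 0],
]
flat = [[1] * 4 for _ in range(4)]

# After 2x downscaling both become the same 2x2 grid of 1.0 values.
same_after_downscale = downscale(checker, 2) == downscale(flat, 2)

# Pixel-count ratio between an assumed 7680x4320 image and a 32x32 one.
pixel_ratio = (7680 * 4320) / (32 * 32)
```

Under that assumption the ratio comes out to 32,400, which is consistent with "many thousands of times higher in resolution".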

Things affect LLMs besides tokens, like ECC errors or cosmic rays? …

Come on, now. That's irrelevant.

Reality is by definition our physical reality, which is about an infinite number of levels more detailed than the, you know, _virtual_ digital world computers exist in.

Whatever world we construct for LLMs, no matter how detailed we make it, will always be a blocky projection of the real domain onto a virtual one.

It follows then that any insight gained in the virtual world is at best a rough approximation which can be quite useful at times but also utterly faulty on occasion.

How often it is useful vs. wrong is (partially) a function of how complete the real-to-virtual approximation is for a given domain.

Certain domains, given their limited degrees of freedom, can be quite accurately modeled, such as a subway map.

But many domains cannot, and it's important to be aware of that inherent limitation in digital models, including but not limited to LLM """reasoning""".

>Whatever world we construct for LLMs, no matter how detailed we make it, will always be a blocky projection of the real domain onto a virtual one.

I don't know exactly why, but I never really understood this argument. Might be some kind of control thing? Because for me it's pretty simple: it's basically free to give access to reality. Just add "sensory organs", as it were. I'd argue you can make them perceive reality even better than we (humans) do: just widen the audio/video spectrums. Bam... more reality. The whole point of the argument is that we're missing information.

Again, I get the need for controlling the environment for what LLM/AI/AGI/whatever will be, but that will always cost more than giving them access to like...reality. Same reason I don't really believe in the whole simulation argument, it's just more expensive all around, loses resolution, let alone control. I don't doubt there will be some people that would indulge in neverending hedonism but not all people. You need to give up control for that.

There are two reasons.

First, reality is continuous whereas the digital world is discrete.

Second, data in the real world is many orders of magnitude more detailed than what we're able to model with today's computers.

> Because for me it's pretty simple, it's basically free to give access to reality. Just add "sensory organs" as it were.

I dunno what you mean by "free". The model is trained on text. To "give" the model sensory organs it would need to be trained on those sensory organs.

Current models can predict text, because that's what the weights represent. Models with sensory organs will need to be trained on the output of those sensory organs.

That sounds close to impossible in the foreseeable future.

>I dunno what you mean by "free".

Reality is free. You don't have to waste any resources to model it, you just need to capture it.

>The model is trained on text.

See in my previous reply:

>LLM/AI/AGI/whatever will be

LLMs don't even have a sense of time because they work differently to a human brain.

Vision and audio are already in use in multimodal LLMs. So it's possible in the past.

Who said anything about vision and audio?

In the spirit of the article, what detail in the decision making of layoffs might you be missing?

I expect there's a lot of detail that I'm unaware of relating to running a company (planning; risk; legal; ...) that might make a decision foolish to me, but make sense if given more context.