Chess is a bad example. Even a "stupid" computer that is sufficiently powerful can just brute-force-search its way to a win. There's nothing special here; it's basically just deeper and deeper search. Put another way, the limitation was always about sufficiently powerful hardware.
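
For concreteness, here's a minimal sketch of that "deeper and deeper search" in Python. The `Game` interface (`legal_moves`, `apply`, `evaluate`) is hypothetical, standing in for a real engine's move generator and static evaluator:

```python
def negamax(position, depth, game):
    """Score `position` for the side to move, searching `depth` plies.

    Assumes `game.evaluate` returns a score from the perspective of
    the side to move, as in standard negamax.
    """
    moves = game.legal_moves(position)
    if depth == 0 or not moves:
        # Leaf node (or, simplifying, a terminal position): fall back
        # to a static evaluation instead of searching further.
        return game.evaluate(position)
    best = float("-inf")
    for move in moves:
        # The opponent's best score, negated, is our score for this move.
        score = -negamax(game.apply(position, move), depth - 1, game)
        best = max(best, score)
    return best
```

The only knob here is `depth`: a faster machine just lets you turn it higher, and the same dumb procedure plays better.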

I'm not sure the same can be said about LLMs.

It seems a bit presumptuous to assume that software and hardware won't evolve past May 2025 to improve watts/token, or whatever metric you choose. Consumer-grade GPUs didn't really arrive until 1995; the industry didn't standardize on OpenGL until the early 90s, and consumer-grade GPUs didn't get OpenGL support until much later. Vulkan didn't come along until 2016. The fact that I can't buy a 4070 with 1TB of memory at Best Buy for $1200 is mostly an artificial limit, and one that may well fall in a year or two. I would expect watts/token to drop by at least half by the end of the decade.

How do you not see it's still just deeper and deeper search?

In a sense, yes, but my point is that it is not a given that making LLMs bigger and bigger will make them qualitatively much better than they currently are.