I think we agree?

What moat? You answered yourself: "capital intensive"

But, history says the supercomputer of today will fit in your pocket in a few years.

They've bought up all the RAM and GPUs, which pushes the capital requirements upward for everyone else. But they can't corner the market forever; there are too many competing interests. AMD and Intel keep making new GPUs and APUs. The memory makers can't sell only to AI companies forever; if they do, Chinese manufacturers will move in and eventually eat them from below (as has happened many times before).

They have a moat today, and it's just that it's really expensive to train and host frontier models, especially at commercial scale. It used to be that there was also some secret sauce to making it fast and efficient. But the secret sauce is being published daily by all sorts of researchers. Folks are figuring out how to do more with less, and it often finds its way into llama.cpp or vLLM or SGLang within days or weeks.

> But, history says the supercomputer of today will fit in your pocket in a few years.

I don't think this will be true in the same time span anymore. Each miniaturization is costing more and more money.

Perhaps they'll come up with exotic fundamental improvements, but I don't think the rate of improvement of compute/watt will match the previous decades.

Yeah, that's probably true, but we're also seeing that there are still tons of inefficiencies in how LLMs are being run. It seems like every couple of months there's some new technique to squeeze more performance out of less hardware: KV caching improvements, fast attention, speculative decoding, dynamic quantization, quantization-aware training, etc.
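To make the "less hardware" point concrete for quantization alone, here is a minimal back-of-the-envelope sketch. The 70-billion-parameter model size is a made-up example, and `weight_memory_gb` is a hypothetical helper; it counts only the weights and ignores the KV cache, activations, and any per-block quantization overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 70e9  # hypothetical model size, chosen only for illustration

for bits in (16, 8, 4):
    print(f"{bits:2d}-bit weights: {weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

Going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is roughly the difference between needing several datacenter GPUs and fitting on one large-memory workstation.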

That said, I recently replaced my five year old self-built PC (with a top-of-the-line desktop CPU, chipset, memory, and GPU of the time) with a new everything-the-best build, and while it's clear we're not keeping up with Moore's Law anymore, it's still 4-5 times faster for compute-intensive stuff, especially parallelizable tasks. We're still getting faster/cheaper. So, the time scale is maybe ten years rather than five.

Really the biggest concerns are not computers getting spectacularly faster, but 'intelligence' algorithms getting orders of magnitude better.

Drop the power requirements 1000-fold, and yeah, you will be able to make your own SOTA model on the cheap. The problem is that the person who has a few exaflops of compute will still leave you in the dust in the intelligence explosion that would happen after an event like this.

Depends on the intelligence vs. compute scaling law, which I think no one really knows. Pretty likely to be some degree of diminishing returns, but how much? Is it logarithmic, inverse quadratic, …
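To see how much the shape of the curve matters, here is a small sketch comparing what 1000x more compute buys under a few candidate curves. All three curves are hypothetical stand-ins, not established scaling laws, and "square root" is just one possible reading of "inverse quadratic"; the names `CURVES` and `gain_from_1000x` are made up for this example.

```python
import math

# Hypothetical capability-vs-compute curves; none of these is an established law.
CURVES = {
    "logarithmic": lambda c: math.log10(c),
    "square root": lambda c: math.sqrt(c),  # one reading of "inverse quadratic"
    "linear": lambda c: c,                  # no diminishing returns, for contrast
}

def gain_from_1000x(curve, base_compute=1e3):
    """Ratio of capability at 1000x the compute to capability at base_compute."""
    return curve(base_compute * 1000) / curve(base_compute)

for name, curve in CURVES.items():
    print(f"{name:12s} 1000x compute -> {gain_from_1000x(curve):.1f}x capability")
```

Under the logarithmic curve the player with 1000x the compute ends up only about 2x ahead from this starting point, while under the square-root curve the lead is roughly 32x. Whether the big spender "leaves you in the dust" depends almost entirely on which curve is closer to reality.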

If training models gets way cheaper, I would expect the diminishing returns to get steeper too.

> Pretty likely to be some degree of diminishing returns

Intelligence may be different. If we look at biological brains, do we get diminishing returns or a completely opposite scaling law when we compare our brains against, say, a gorilla's?

Interesting thought to consider in principle, but it fails because gorilla brains continued to evolve too, just along a different path. They're not snapshots of ancestral species locked in time.

Single-core clock speed hasn't improved much, but the architecture for doing exactly what they are doing? That will improve for at least 5-10 years. There are both huge power gains from Processing in Memory (PIM) chips (a 70-80% reduction in energy) and engineering improvements that make memory cheaper and cheaper.

Yes, I'm talking about a supercomputer from today in your pocket. That probably requires at least 5000x perf/watt if not even more.
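For a rough sense of the time scale, here is a back-of-the-envelope sketch. The 5000x target is the figure above; the two improvement rates are assumptions: a doubling every two years (the usual statement of Moore's law) and roughly 4.5x every five years (the midpoint of the 4-5x desktop speedup mentioned elsewhere in the thread, which measured speed rather than perf/watt). The function name `years_to_reach` is made up for illustration.

```python
import math

def years_to_reach(target_factor: float, gain: float, period_years: float) -> float:
    """Years needed to improve by target_factor if performance
    multiplies by `gain` every `period_years` (steady exponential growth)."""
    return period_years * math.log(target_factor) / math.log(gain)

TARGET = 5000  # perf/watt factor quoted above

# Assumed rates, not measurements of perf/watt:
moore = years_to_reach(TARGET, gain=2, period_years=2)
desktop = years_to_reach(TARGET, gain=4.5, period_years=5)

print(f"Doubling every 2 years: {moore:.1f} years")
print(f"4.5x every 5 years:     {desktop:.1f} years")
```

Under either assumption the answer comes out at well over two decades rather than a few years.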

> but I don't think the rate of improvement of compute/watt will match the previous decades.

Unless we invest heavily in research and find a new way to make chips. But I think there's not enough motivation and money to do that.

There's literally never been more money being thrown at that problem.

> I think we agree?

That is such a crazy way to start a response to someone trying to argue with you. I should try this. That's amazing. I know you didn't mean it as a trick, at least I'm pretty sure you meant it sincerely, but I'm just struck by the power of it to defuse and redirect the conversation. And this was a very low-grade example, but I could imagine this being useful in much more heated contexts.

I think in general stripping away the parts you agree with from the argument works great, because it strips away a whole lot of potential for ending up indirectly arguing over things that aren't in contention, and it often also defuses the rest when it turns out the core of the argument perhaps is much smaller than people are willing to get invested in.

How do you do that without sounding negative? Because by doing that there's the risk of the general impression "we didn't agree", as you basically focused on the disagreements.

OTOH I have often witnessed people agreeing without realizing it. I've been able to defuse a bunch of arguments by pointing that out.

Yeah, more valuable than the comments I came to read (even if those are interesting too!)

Usually people are taught these techniques in management courses. If you're at a BigCorp that pushes managers through such courses, you can hear a lot of that stuff in your manager's speech if you pay attention to it.

> They’ve bought up all the RAM and GPUs…

Is there an endgame where even this is considered overly complex? Instead of starving the competition by buying up all the compute, why not just buy up… all the money!? Hoover up as much investment capital as possible so that your competitors can’t get funding.

I assume this is an honest question, in which case the answer is that funding is not really finite.

Or just "buy" your competition like big tech did.

Every major tech company literally has deals, ownership stakes, alliances, etc.

They're literally not going to gobble everything up outright and trigger an antitrust case.

The other half of the moat is the data they stole from everyone else, some of it illegally. So, be sure they will do everything in their power to stop others from getting that data freely.

Yeah, I think a lot of the "slow down" rumblings we're hearing from OpenAI and Anthropic are really overtures toward regulatory capture; basically, "now that we're in the lead, we need to lock this shit down so nobody else can catch up."

But... OpenAI and Anthropic can't stop China and the EU, can they?

Depends on your worldview; they might or might not come up with something better. But I guess we can agree nothing will stop them from _trying_?

The US successfully enforces the DMCA and other copyright rules on the EU while now giving its own big tech a free pass.

China will certainly compete though.

> But, history says the supercomputer of today will fit in your pocket in a few years.

That was Moore's law saying that. And it seems Moore's law has slowed down quite a bit for now.

Yes, but surely AI is going to save us from the bloated stack of modern software.

"But, history says the supercomputer of today will fit in your pocket in a few years."

Hmm, no? Physics says otherwise.