NVIDIA also seems like a risky bet because Google has its own chip R&D; it's not just another data center buying NVIDIA's GPUs. BTW, Apple is following suit on chips. I don't know whether at some point Apple will offer a cloud service or keep working on their ecosystem only.

Beyond that, there are a lot of new chip companies attacking the market: https://news.ycombinator.com/item?id=45686790

This makes sense to me. Where I work, our AI team set up a couple of H100 cards and is hosting a newer model that uses around 80 GB of VRAM. You can watch the GPU utilization in Grafana hit something like 80% for seconds at a time as it processes a single request. That was very surprising to me. This is $30k worth of hardware that can support only a couple of users, and maybe only one if you have an agent going. Now, maybe we're doing something wrong, but it's hard to imagine anyone making money hosting billions of dollars of these cards when you're making $20 a month per card. I guess it depends on how active your users are. Hard to imagine Anthropic is right side up here.
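For perspective, here's a back-of-the-envelope on those numbers (a sketch only: it assumes a 3-year amortization and ignores power, cooling, and everything else):

    # Back-of-the-envelope: subscriptions needed to cover the hardware
    # alone. Figures are the rough ones from above; power, cooling,
    # networking, and staff are all ignored.
    hardware_cost = 30_000        # USD for the two H100s
    amortization_months = 36      # assume a 3-year useful life
    subscription = 20             # USD per user per month

    monthly_cost = hardware_cost / amortization_months   # ~$833/month
    break_even_users = monthly_cost / subscription       # ~42 users

    print(f"Hardware alone: ${monthly_cost:.0f}/month")
    print(f"Break-even: ~{break_even_users:.0f} subscribers per card pair")

So if the pair really only serves a handful of concurrent users, the math doesn't close; it only works if each card can serve dozens of lightly active subscribers.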

But was that with batching? It makes a big difference: you can run many requests in parallel on the same card when you're doing LLM inference.
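To make that concrete, here's a minimal sketch using vLLM, which batches concurrent requests onto one GPU automatically (continuous batching); the model name and sampling settings are just placeholders:

    # Minimal sketch: many prompts share one GPU via vLLM's
    # continuous batching. Model name and settings are placeholders.
    from vllm import LLM, SamplingParams

    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
    params = SamplingParams(temperature=0.7, max_tokens=256)

    # All 64 requests get scheduled onto the same card; tokens for
    # different sequences are generated in the same forward passes.
    prompts = [f"Summarize ticket #{i}: ..." for i in range(64)]
    outputs = llm.generate(prompts, params)

    for out in outputs:
        print(out.outputs[0].text[:80])

The point being: a utilization spike from a single request doesn't mean the card is saturated. With batching, the same hardware can keep dozens of sequences in flight, which changes the per-user math considerably.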

While Google has their own chips, they don't really have the market power to buy up bleeding-edge manufacturing capacity.

Apple on the other hand... (though they're behind in other regards)

But they were just an example. There are many players emerging. You can look at this: https://news.ycombinator.com/item?id=45746246 (even <https://www.nextsilicon.com/> is not on that list).

I don't see a natural monopoly anymore.