I haven't seen this discussed here:

So far, the accelerator is showing cost savings of roughly 50% compared with typical AI graphics processing units, Broadcom Chief Executive Officer Hock Tan said in an interview. - [0]

50% cost saving. The picture changes so quickly, and there is still so much low-hanging fruit, that I find any discussion of whether a vendor has a moat, or whether it can recoup its investment, moot and futile.

[0] - https://www.bloomberg.com/news/articles/2026-06-24/openai-an...

If GPUs have 75% margin then 50% cheaper is no surprise.

Operational costs far outweigh hardware cost.

Let's use a 1 GW AI deployment as an example.

At $0.07/kWh, that costs $70,000 every hour in electricity alone. That's about $1.7 million/day, or $613 million/year.
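For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the full 1 GW is drawn continuously all year and that $0.07/kWh is a flat rate; the variable names are mine.

```python
# Assumptions: constant 1 GW draw, flat $0.07/kWh, 365-day year.
POWER_KW = 1_000_000      # 1 GW expressed in kW
PRICE_PER_KWH = 0.07      # USD per kWh

per_hour = POWER_KW * PRICE_PER_KWH   # kW * 1 h * $/kWh
per_day = per_hour * 24
per_year = per_day * 365

print(f"${per_hour:,.0f}/hour, ${per_day:,.0f}/day, ${per_year:,.0f}/year")
```

That comes to $1.68 million/day and $613.2 million/year, which matches the rounded figures.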

I had Claude estimate the GPU cost of such a deployment:

> To get racks per GW: a full NVL72 rack draws roughly 130-132 kW under full load. If a 1 GW facility runs ~715 MW of IT power (after a ~1.4 PUE for cooling), that's on the order of 4,000–4,500 racks. At $3.4M of compute hardware each, the GPU-system cost lands around $14–15 billion.

$15 billion / $613 million per year = ~24.5 years until electricity costs catch up to the GPU cost. Obviously electricity isn't 100% of OpEx, but I'd expect it to be the majority for AI deployments.
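The same break-even as a sketch, taking the $14-15 billion hardware estimate and the $613 million/year electricity figure at face value. `payback_years` is just a hypothetical helper name, and this ignores hardware refresh cycles and every non-electricity operating cost.

```python
# Assumptions: GPU cost taken from the quoted estimate,
# electricity cost from the $0.07/kWh figure above.
ELECTRICITY_PER_YEAR = 613e6   # USD per year


def payback_years(capex: float, annual_cost: float) -> float:
    """Years until cumulative annual_cost equals capex."""
    return capex / annual_cost


low = payback_years(14e9, ELECTRICITY_PER_YEAR)    # low end of the estimate
high = payback_years(15e9, ELECTRICITY_PER_YEAR)   # high end of the estimate
print(f"{low:.1f} to {high:.1f} years")
```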

Regardless, if you can cut the $613 million/yr in half that's still massive savings.

Do they? Genuinely asking.

Yep, I was surprised to learn that too.

For a small cluster, no, but at major data center level, yes. Which is why they are building data centers bigger than stadiums.

If you spend $10B on a data center, roughly 30% of that price is going to hardware, so roughly $3B.

So for two data centers you're spending $20B.

Now, assume there's hardware that performs twice as fast at the same energy per token. Even if it cost you twice as much, you're saving $7B because you don't need the second data center.

You get the same output as $20B of data centers out of a $13B initial investment, and you're also halving operational costs: fewer staff, fewer lawyers, etc.
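Spelled out as a sketch, using only the numbers in this comment: $10B per data center, 30% of it hardware, and a hypothetical chip that is twice as fast at twice the price. `build_cost` and its parameters are my own naming, not anyone's real cost model.

```python
# Assumptions: $10B per site, 30% of that is hardware,
# and the non-hardware 70% is unaffected by which chip you buy.
DC_COST = 10e9
HW_SHARE = 0.30


def build_cost(sites: int, hw_price_multiplier: float = 1.0) -> float:
    """Total capex for `sites` data centers at a given hardware price multiple."""
    hardware = DC_COST * HW_SHARE * hw_price_multiplier
    everything_else = DC_COST * (1 - HW_SHARE)
    return sites * (hardware + everything_else)


baseline = build_cost(2)        # two sites, current hardware
faster = build_cost(1, 2.0)     # one site, 2x-priced hardware with 2x output
savings = baseline - faster

print(f"${baseline / 1e9:.0f}B vs ${faster / 1e9:.0f}B, saving ${savings / 1e9:.0f}B")
```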

This is why Nvidia is making gargantuan margins: hyperscalers don't really care about hardware cost. If they can get double the output, save themselves 30-40% of total costs, and avoid 50% of the headaches, they will keep buying at twice the price, gen over gen.

"Typical" is doing a lot of work there. That could mean much older chips than Nvidia is currently selling.

"Typical" usually means typical, i.e. median. Also, they are claiming a cost saving, not a performance gain. The saving would be even more impressive if much older chips are less efficient than the newer ones, and so cost more to run.