> If tpu's actually breakout as a viable alternative over the next few years
Why haven't they broken out yet, I wonder, if they're more efficient for inference and LLM costs are now weighted towards inference over training?
You essentially have to run on Google's cloud to use them, and that probably limits their ability to break out. Anthropic might be doing this deal as a way to shore up its supply chain and its cost of both inference and training by leveraging Google's hardware and chip-manufacturing expertise.
Several customers, like Citadel, run TPUs in their own datacenters (closer to exchanges).
Every TPU that's been made is in use and sold at a high margin; demand is not the issue.
TPUs are not that portable or easy to use, for either inference or training. That has improved a lot with the work on the Torch backend (XLA/TorchTPU) and JAX, though.
But as far as I know, it currently supports just those plus TensorFlow (which nobody uses anymore, at least here). And the last time we tried, so many of our kernels needed rework that it wasn't worth the effort.
This may have changed since Ironwood, but we haven't tried that generation.
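To illustrate the portability point above: the appeal of JAX on TPUs is that the same code runs unchanged on CPU, GPU, or TPU, because XLA compiles it for whichever backend is present. A minimal sketch (the softmax kernel here is just a hypothetical example, not anything from the thread):

```python
import jax
import jax.numpy as jnp

@jax.jit
def scaled_softmax(x, temperature=1.0):
    # jax.jit traces this function once and hands the computation graph
    # to XLA, which emits code for the local device (TPU, GPU, or CPU).
    # No hand-written per-device kernels are needed for ops like these.
    z = x / temperature
    z = z - jnp.max(z, axis=-1, keepdims=True)  # subtract max for numerical stability
    e = jnp.exp(z)
    return e / jnp.sum(e, axis=-1, keepdims=True)

x = jnp.arange(4.0)
probs = scaled_softmax(x)
print(jax.devices()[0].platform)  # "tpu", "gpu", or "cpu", depending on the host
print(float(jnp.sum(probs)))     # probabilities sum to 1.0
```

The friction mentioned above shows up when a model relies on custom kernels (e.g. hand-tuned CUDA): those don't go through XLA and have to be rewritten, which is where the "rework" cost comes from.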
There are literally not enough TPUs on Earth for them to break out: every TPU that's been made is in use, the spike in demand is recent, and Google faces heavy competition for foundry space.
Possibly because they just haven't been able to manufacture enough of them yet for selling to others to be a viable business? They're fighting everyone else for foundry space and time.