Hacker News

> It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis

There any many places that will not use models running on hardware provided by OpenAI / Anthropic. That is the case true of my (the Australian) government at all levels. They will only use models running in Australia.

Consequently AWS (and I presume others) will run models supplied by the AI companies for you in their data centres. They won't be doing that at a loss, so the price will cover marginal cost of the compute plus renting the model. I know from devs using and deploying the service demand outstrips supply. Ergo, I don't think there is much doubt that they are making money from inference.

deaux 2 months ago [ - ]

> Consequently AWS (and I presume others) will run models supplied by the AI companies for you in their data centres. They won't be doing that at a loss, so the price will cover marginal cost of the compute plus renting the model.

This says absolutely nothing.

Extremely simplified example: let's say Sonnet 4.5 really costs $17/1M output for AWS to run yet it's priced at $15. Anthropic will simply have a contract with AWS that compensates them. That, or AWS is happy to take the loss. You said "they won't be doing that at a loss" but in this case it's not at all out of the question.

Whatever the case, that it costs the same on AWS as directly from Anthropic is not an indicator of unit economics.

waffletower 2 months ago [ - ]

In the case of Anthropic -- they host on AWS all the while their models are accessible via AWS APIs as well, the infrastructure between the two is likely to be considerably shared. Particularly as caching configuration and API limitations are near identical between Anthropic and Bedrock APIs invoking Anthropic models. It is likely a mutually beneficial arrangement which does not necessarily hinder Anthropic revenue.

freakynit 2 months ago [ - ]

Genuine question: Given Anthropic's current scale and valuation, why not invest in owning data centers in major markets rather than relying on cloud providers?

Is the bottleneck primarily capex, long lead times on power and GPUs, or the strategic risk of locking into fixed infrastructure in such a fast-moving space?