I have not seen any reporting or evidence at all that Anthropic or OpenAI is able to make money on inference yet.
> Turns out there was a lot of low-hanging fruit in terms of inference optimization that hadn't been plucked yet.
That does not mean the frontier labs are pricing their APIs to cover their costs yet.
It can both be true that it has gotten cheaper for them to provide inference and that they still are subsidizing inference costs.
In fact, I'd argue that's way more likely, given that this has been precisely the go-to strategy for highly competitive startups for a while now. Price low to pump adoption and dominate the market, worry about raising prices for financial sustainability later, burn through investor money until then.
What no one outside of these frontier labs knows right now is how big the gap is between current pricing and eventual pricing.
It's quite clear that these companies do make money on each marginal token. They've said this directly and analysts agree [1]. It's less clear that the margins are high enough to pay off the up-front cost of training each model.
[1] https://epochai.substack.com/p/can-ai-companies-become-profi...
It’s not clear at all because model training upfront costs and how you depreciate them are big unknowns, even for deprecated models. See my last comment for a bit more detail.
They are obviously losing money on training. I think they are selling inference for less than what it costs to serve these tokens.
That really matters. If they are making a margin on inference they could conceivably break even no matter how expensive training is, provided they sign up enough paying customers.
If they lose money on every paying customer, then building great products that customers want to pay for will just make their financial situation worse.
"We lose money on each unit sold, but we make it up in volume"
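To make the break-even argument above concrete, here is a minimal sketch. All figures are hypothetical placeholders, not reported numbers for any lab, and `breakeven_customers` is an illustrative helper, not anyone's actual model. It only shows that a positive per-customer margin gives a finite break-even point, while a negative one never does.

```python
import math


def breakeven_customers(training_cost, revenue_per_customer, inference_cost_per_customer):
    """Paying customers needed to recover a fixed training cost.

    Returns None when the per-customer margin is zero or negative,
    because no number of customers can then cover the fixed cost.
    """
    margin = revenue_per_customer - inference_cost_per_customer
    if margin <= 0:
        return None
    return math.ceil(training_cost / margin)


# Hypothetical: $1B training cost, $240/year revenue per customer.
print(breakeven_customers(1_000_000_000, 240, 180))  # positive margin: 16666667
print(breakeven_customers(1_000_000_000, 240, 300))  # negative margin: None
```

With a negative margin every additional customer widens the loss, which is the "make it up in volume" joke.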
By now, model lifetime inference compute is >10x model training compute, for mainstream models. Further amortized by things like base model reuse.
Those are not marginal costs.
> They've said this directly and analysts agree [1]
Chasing down a few sources in that article leads to articles like this one at the root of the claims [1], which is entirely based on information "according to a person with knowledge of the company’s financials", which doesn't exactly fill me with confidence.
[1] https://www.theinformation.com/articles/openai-getting-effic...
"according to a person with knowledge of the company’s financials" is how professional journalists tell you that someone who they judge to be credible has leaked information to them.
I wrote a guide to deciphering that kind of language a couple of years ago: https://simonwillison.net/2023/Nov/22/deciphering-clues/
Unfortunately, tech journalists' judgement of source credibility doesn't have a very good track record.
But there are companies which are only serving open-weight models via APIs (i.e. they are not doing any training), so they must be profitable? Here's one list of providers from OpenRouter serving Llama 3.3 70B: https://openrouter.ai/meta-llama/llama-3.3-70b-instruct/prov...
It's also true that their inference costs are being heavily subsidized. For example, if you factor Oracle's debt into OpenAI's revenue, they would be incredibly far underwater on inference.
Sure, but if they stop training new models, the current models will be useless in a few years as our knowledge base evolves. They need to continually train new models to have a useful product.
> they still are subsidizing inference costs.
They are for sure subsidising costs on the all-you-can-prompt packages ($20, $100, $200/mo). They do that mostly for data gathering, and to a lesser degree for user retention.
> evidence at all that Anthropic or OpenAI is able to make money on inference yet.
You can infer that from what 3rd party inference providers are charging. The largest open models atm are dsv3 (~650B params) and kimi2.5 (1.2T params). They are being served at 2-2.5-3$ /Mtok. That's sonnet / gpt-mini / gemini3-flash price range. You can make some educated guesses that they get some leeway for model size at the 10-15$ /Mtok prices for their top-tier models. So if they are within some sane range of model sizes, they are likely making money off of token-based APIs.
> They are being served at 2-2.5-3$ /Mtok. That's sonnet / gpt-mini / gemini3-flash price range.
The interesting number is usually input tokens, not output, because there's much more of the former in any long-running session (like say coding agents) since all outputs become inputs for the next iteration, and you also have tool calls adding a lot of additional input tokens etc.
It doesn't change your conclusion much though. Kimi K2.5 has almost the same input token pricing as Gemini 3 Flash.
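A rough sketch of why input tokens dominate in agent-style sessions, assuming each turn resends the whole context and ignoring prompt caching (which would lower the effective input cost). The function name and all numbers are made up for illustration.

```python
def session_tokens(turns, prompt_tokens, output_tokens_per_turn, tool_tokens_per_turn=0):
    """Total (input, output) tokens over a multi-turn session.

    Each turn sends the full context so far as input; the model's output
    and any tool results are then appended to the context for the next turn.
    """
    total_in = 0
    total_out = 0
    context = prompt_tokens
    for _ in range(turns):
        total_in += context
        total_out += output_tokens_per_turn
        context += output_tokens_per_turn + tool_tokens_per_turn
    return total_in, total_out


# Hypothetical session: 10 turns, 2000-token prompt, 500 output tokens
# and 1000 tool-result tokens per turn.
total_in, total_out = session_tokens(10, 2000, 500, 1000)
print(total_in, total_out)  # 87500 5000
```

In this toy case input tokens outnumber output tokens 17.5 to 1, and the ratio keeps growing with session length.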
most of those subscriptions go unused. I barely use 10% of mine
so my unused tokens compensate for the few heavy users
I've been thinking about our company, one of the big global conglomerates that went for Copilot. Suddenly I was just enrolled, together with at least 1500 others. I guess the cost of our business Copilot plans x 1500 is not a huge amount of money, but I am pretty convinced that only a small share of users use even 10% of their quota. Even among the teams located around me, I only know of 1 person who seems to use it actively.
Thanks!
I hope my unused gym subscription pays back the good karma :-)
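The gym-membership effect described in this subthread can be sketched as a blended margin per seat. Everything here is an assumption for illustration: the price, the cost of serving a fully used quota, and the usage mix are invented, and `average_margin_per_seat` is a hypothetical helper.

```python
def average_margin_per_seat(price, full_quota_cost, usage_fractions):
    """Average monthly margin per subscriber on a flat-rate plan.

    usage_fractions holds, for each subscriber, the fraction of the quota
    actually used (0.0 to 1.0). Serving cost is assumed to scale linearly
    with usage.
    """
    costs = [full_quota_cost * u for u in usage_fractions]
    return price - sum(costs) / len(costs)


# Hypothetical: $20/mo plan, a fully used quota costs $100 to serve,
# nine subscribers use 10% of their quota and one uses all of it.
usage = [0.1] * 9 + [1.0]
print(round(average_margin_per_seat(20, 100, usage), 2))
```

Under these assumed numbers the plan roughly breaks even on average, even though the single heavy user alone costs five times the subscription price.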
> I have not seen any reporting or evidence at all that Anthropic or OpenAI is able to make money on inference yet.
Anthropic planning an IPO this year is a broad meta-indicator that internally they believe they'll be able to reach break-even sometime next year on delivering a competitive model. Of course, their belief could turn out to be wrong but it doesn't make much sense to do an IPO if you don't think you're close. Assuming you have a choice with other options to raise private capital (which still seems true), it would be better to defer an IPO until you expect quarterly numbers to reach break-even or at least close to it.
Despite the willingness of private investment to fund hugely negative AI spend, the recently growing twitchiness of public markets around AI ecosystem stocks indicates they're already worried prices have exceeded near-term value. It doesn't seem like they're in a mood to fund oceans of dotcom-like red ink for long.
> Despite the willingness of private investment to fund hugely negative AI spend
VC firms, even ones the size of Softbank, also literally just don't have enough capital to fund the planned next-generation gigawatt-scale data centers.
IPO'ing is often what you do to give your golden investors an exit hatch to dump their shares on the notoriously idiotic and hype-driven public.
> evidence at all that Anthropic or OpenAI is able to make money on inference yet.
The evidence is in third party inference costs for open source models.