> A year or more ago, I read that both Anthropic and OpenAI were losing money on every single request even for their paid subscribers
This gets repeated everywhere but I don't think it's true.
The company is unprofitable overall, but I don't see any reason to believe that their per-token prices are below the marginal cost of computing those tokens.
Yes, the company loses money once you account for R&D spend, compensation, training, and everything else. This is a deliberate choice that every heavily funded startup should be making; otherwise you're wasting the investment money. That's precisely what the investment money is for.
However, I don't think that someone using their API and paying for tokens has negative value for the company. Compare models like DeepSeek, where providers can charge a fraction of the price of OpenAI tokens and still be profitable. OpenAI's inference costs are going to be higher, but they're charging such a high premium that it's hard to believe they're losing money on each token sold. I think every token paid for moves them incrementally closer to profitability, not away from it.
The reports I remember show that each model is profitable on its own, but R&D for the next model overlaps with it, so the company is negative overall. So they would turn a massive profit if they stopped making new models.
* That is, if they stop making new models and people keep using the existing models rather than switching to a competitor that is still investing in new ones.
Doesn’t it also depend on whether you average in the free users?
I can see a case for omitting R&D when talking about profitability, but omitting training makes no sense. Training is what makes the model; omitting it is like omitting the cost of running the production facility of a car manufacturer. If AI companies stop training, they will stop producing models and will run out of products to sell.
The reason for this is that the cost scales with the model and the training cadence, not with usage. So they hope to spread it over more inference tokens sold, either by increasing usage or by slowing the training cadence once competitors are also forced to aim for overall profitability.
It is essentially a big game of venture capital chicken at present.
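To make the "cost scales with cadence, not usage" point concrete, here is a rough sketch of the break-even arithmetic. Every number and name in it is hypothetical and picked only to show the shape of the argument; none of it comes from any company's actual figures.

```python
# Hypothetical figures only: they illustrate the arithmetic, not real costs.

def break_even_mtok(training_cost_per_model, models_per_year,
                    price_per_mtok, marginal_cost_per_mtok):
    """Millions of tokens per year that must be sold so the inference
    margin covers the yearly training bill."""
    margin = price_per_mtok - marginal_cost_per_mtok
    if margin <= 0:
        raise ValueError("no positive per-token margin; volume cannot cover training")
    return training_cost_per_model * models_per_year / margin

# Assumed: $1B per training run, $10 price and $5 marginal cost per million tokens.
fast_cadence = break_even_mtok(1_000_000_000, 2, 10.0, 5.0)  # two models a year
slow_cadence = break_even_mtok(1_000_000_000, 1, 10.0, 5.0)  # one model a year

print(f"two models/year: {fast_cadence:,.0f} Mtok to break even")
print(f"one model/year:  {slow_cadence:,.0f} Mtok to break even")
```

Under these made-up inputs, halving the training cadence halves the token volume needed to break even, which is the lever the comment above is describing.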
It depends on what you're talking about.
If you're looking at overall profitability, you include everything.
If you're talking about the unit economics of producing tokens, you only include the marginal cost of each token against the marginal revenue of selling that token.
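A toy version of the two calculations, as a sketch with entirely hypothetical numbers (the function names and figures are made up for illustration, not taken from any report):

```python
# Hypothetical numbers; the point is the two different questions being asked.

def unit_margin(price_per_mtok, marginal_cost_per_mtok):
    """Unit economics: what one more million tokens sold contributes."""
    return price_per_mtok - marginal_cost_per_mtok

def overall_profit(mtok_sold, price_per_mtok, marginal_cost_per_mtok, fixed_costs):
    """Overall profitability: total margin minus everything else
    (training, R&D, compensation, and so on)."""
    return mtok_sold * unit_margin(price_per_mtok, marginal_cost_per_mtok) - fixed_costs

# Assumed: $10 price, $4 marginal cost per Mtok, 100 Mtok sold, $1000 fixed costs.
margin = unit_margin(10, 4)
before = overall_profit(100, 10, 4, 1000)
after = overall_profit(101, 10, 4, 1000)

print(margin)  # positive: each sale helps
print(before)  # negative: the company as a whole still loses money
print(after)   # less negative after one more Mtok is sold
```

With these inputs the unit margin is positive while the overall number is negative, so both "they make money on every token" and "the company is unprofitable" can be true at once.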
I don’t understand the logic. Without training, the marginal cost of each token means nothing. The more you train, the better the model, and (presumably) the more customer interest you will gain. Unlike R&D, training is something you will always have to do if you want to keep your customers.
To me this looks like some creative bookkeeping, or even wishful thinking. It is as if SpaceX omitted the price of the satellites when calculating its profits.