These articles (of which there are many) all make the same basic accounting mistakes. You have to include all the costs associated with the model, not just inference compute.
This article is like saying an apartment complex isn't "losing money" because the monthly rents cover operating costs while ignoring the cost of the building. Most real estate developments go bust because the developers can't pay the mortgage payment, not because they're negative on operating costs.
If the cash flow was truly healthy these companies wouldn’t need to raise money. If you have healthy positive cash flow you have much better mechanisms available to fund capital investment other than selling shares at increasingly inflated valuations. Eg issue a bond against that healthy cash flow.
Fact remains when all costs are considered these companies are losing money, and so long as the lifespan of a model is limited it's going to stay ugly. Using that apartment building analogy, it's like having to knock down and rebuild the building every 6 months to stay relevant, but saying all is well because the rents cover the cost of garbage collection and the water bill. That's simply not a viable business model.
Edit: A lot of commentary below re the R&D and training costs and whether it's fair to exclude them from inference costs or "unit economics." I'd simply say inference is just selling compute and that should be high margin, which the article concludes it is. The issue behind the growing concerns about a giant AI bubble is whether that margin is sufficient to cover the costs of everything else. I'd also say that excluding the cost of the model from "unit economics" calculations doesn't make business/math/economics sense, since it's literally the thing being sold. It's not some bit of fungible equipment or long-term capital expense when it becomes obsolete after a few months. Take away the model and you're just selling compute, so it's really not a great metric to use to say these companies are OK.
> Fact remains when all costs are considered these companies are losing money
You would need to figure out what exactly they are losing money on. Making money on inference is like operating profit - revenue less marginal costs. So the article is trying to answer if this operating profit is positive or negative. Not whether they are profitable as a whole.
If things like cost of maintaining data centres or electricity or bandwidth push them into the red, then yes, they are losing money on inference.
If the thing that makes them lose money is new R&D, then that's different. You could split them up into a profitable inference company and a loss-making startup. Except the startup isn't purely financed by VC etc., but also by a profitable inference company.
Yes that's right. The inference costs in isolation are interesting because that speaks to the unit economics of this business: R&D / model training aside, can the service itself be scaled to operate at a profit? Because that's the only hope of all the R&D eventually paying dividends.
One thing that makes me suspect inference costs are coming down is how chatty the models have become lately, often appending encouragement to a checklist like "You can check off each item as you complete them!" Maybe I'm wrong, but I feel if inference was killing them, the responses would become more terse rather than more verbose.
For the top few providers, the training is getting amortized over an absurd amount of inference. E.g. Google recently mentioned that they processed 980T tokens over all surfaces in June 2025.
The leaked OpenAI financial projections for 2024 showed roughly equal amounts of money spent on training and inference.
Amortizing the training per-query really doesn't meaningfully change the unit economics.
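To put a rough number on that, here's a back-of-envelope sketch. The 980T tokens/month figure is the Google number quoted above; the training cost and model lifespan are purely illustrative assumptions, not reported figures.

```python
# Back-of-envelope amortization of a training run over inference volume.
# Training cost and lifespan are assumed for illustration only.

training_cost_usd = 500e6        # assumed one-off training cost for a frontier model
model_lifespan_months = 6        # assumed useful life before the model is replaced
tokens_per_month = 980e12        # Google's reported June 2025 volume across all surfaces

tokens_over_lifespan = tokens_per_month * model_lifespan_months
amortized_per_1m_tokens = training_cost_usd / tokens_over_lifespan * 1e6

print(f"~${amortized_per_1m_tokens:.2f} of training cost per 1M tokens served")
# ~ $0.09 per 1M tokens, i.e. small next to typical per-token API prices
```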
> Fact remains when all costs are considered these companies are losing money, and so long as the lifespan of a model is limited it's going to stay ugly. Using that apartment building analogy, it's like having to knock down and rebuild the building every 6 months to stay relevant. That's simply not a viable business model.
To the extent they're losing money, it's because they're giving free service with no monetization to a billion users. But since the unit costs are so low, monetizing those free users with ads will be very lucrative the moment they decide to do so.
Assuming users accept those ads. Like, would they make it clear with a "sponsored section", or would they just try to worm it into the output? I could see a lot of potential ways that users reject the ad service, especially if it's seen to compromise the utility or correctness of the output.
Billions of people use Google, YouTube, Facebook, Tiktok, Instagram, etc and accept the ads. Getting similar ad rates would make OpenAI fabulously profitable. They have no need to start with ad formats that might be rejected by users. Even if that were the intended endgame, you'd want to boil the frog for years.
(Author here). Yes, I am aware of that and did mention it. However, what I wanted to push back on in this article was the claim that Claude Code was completely unsustainable and therefore a flash in the pan and devs aren't at risk (I know you are not saying this).
The models as is are still hugely useful, even if no further training was done.
> The models as is are still hugely useful, even if no further training was done.
Exactly. The parent comment has an incorrect understanding of what unit economics means.
The cost of training is not a factor in the marginal cost of each inference or each new customer.
It’s unfortunate this comment thread is the highest upvoted right now when it’s based on a basic misunderstanding of unit economics.
The marginal cost is not the salient factor when the model has to be frequently retrained at great cost. Even if the marginal cost was driven to zero, would they profit?
But they don't have to be retrained frequently at great cost. Right now they are retrained frequently because everyone is frequently coming out with new models and nobody wants to fall behind. But if investment for AI were to dry up everyone would stop throwing so much money at R&D, and if everyone else isn't investing in new models you don't have to either. The models are powerful as they are, most of the knowledge in them isn't going to rapidly become obsolete, and where that is a concern you can paper over it with RAG or MCP servers. If everyone runs out of money for R&D at the same time we could easily cut back to a situation where we get an updated version of the same model every 3 years instead of a bigger/better model twice a year.
And whether companies can survive in that scenario depends almost entirely on their unit economics of inference, ignoring current R&D costs
Like we've seen with Karpathy & Murati starting their own labs, it's to be expected that over the next 5 years, hundreds of engineers & researchers at the bleeding edge will quit and start competing products. They'll reliably raise $1b to $5b in weeks, too. And it's logical: for an investor, a startup founded by a Tier 1 researcher will more reliably 10-100x your capital vs. Anthropic & OpenAI, which are already valued at $250b+.
This talent diffusion guarantees that OpenAI and Anthropic will have to keep sinking in ever more money to stay at the bleeding edge, or upstarts like DeepSeek and incumbents like Meta will simply outspend them / hire away all the Tier 1 talent to upstage them.
The only companies that'll reliably print money off AI are TSMC and NVIDIA because they'll get paid either way. They're selling shovels and even if the gold rush ends up being a bust, they'll still do very well.
True. But at some point the fact that there are many many players in the market will start to diminish the valuation of each of those players, don’t you think? I wonder what that point would be.
> But if investment for AI were to dry up everyone would stop throwing so much money at R&D, and if everyone else isn't investing in new models you don't have to either
IF.
If you do stagnate for years, someone will eventually decide to invest and beat you. Intel has proven as much.
Yeah so? How does that change anything?
Unit economics are the salient factor of inference costs, which this article is about.
I upvoted it because it aligns most closely with my own perspective. I have a strong dislike for AI and everything associated with it, so my judgment is shaped by that bias. If a post sounds realistic or complex, I have no interest in examining its nuance. I am not concerned with practical reality and prefer to accept it without thinking, so I support ideas that match my personal viewpoint.
I don’t understand why people like you have to call this stuff out? Like most of HN thinks the way I do and that’s why the post was upvoted. Why be a contrarian? There’s really no point.
Is this written by a sarcastic AI?
> claude code was completely unsustainable and therefore a flash in the pan and devs aren't at risk
How can you possibly say this if you know anything about the evolution of costs in the past year?
Inference costs are going down constantly, and as models get better they make fewer mistakes, which means fewer cycles = less inference to actually subsidize.
This is without even looking at potential fundamental improvements in LLMs and AI in general. And with all the trillions in funding going into this sector, you can't possibly think we're anywhere near the technological peak.
Speaking as a founder managing multiple companies: Claude Code's value is in the thousands per month /per person/ (with the proper training). This isn't a flash in the pan, this isn't even a "prediction" - the game HAS changed and anyone telling you it hasn't is trying to cover their head with highly volatile sand.
I totally agree with you! I have heard others saying this though. But I don't think it's true.
Got it — I got confused by your wording in the post but it’s clear now.
I think the point isn't to argue AI companies are money printers or even that they're fairly valued, it's that at least the unit economics work out. Contrast this to something like MoviePass, where they were actually losing money on each subscriber. Sure, a company that requires huge capital investments that might never be paid back isn't great either, but at least it's better than MoviePass.
Unit economics needs to include the cost of the thing being sold, not just the direct cost of selling it.
Unit economics is mostly a manufacturing concept, and the only reason it looks OK here is that the cost of building the thing isn't really factored into the cost of the thing.
Someone might say I don't understand "unit economics," but I'd simply argue that applying a unit economics argument and calling it good without including the cost of model training is abusing the concept of unit economics in a way that's not realistic in a business/economics sense.
The model is what's being sold. You can't just sell "inference" as a thing with no model. That's just selling compute, which should be high margin. The article is simply affirming that: yes, when you're just selling compute in micro-chunks it's a decent-margin business, which is a nice analysis but not surprising.
The cost of “manufacturing” an AI response is the inference cost, which this article covers.
> That would be like saying the unit economics of selling software is good because the only cost is some bandwidth and credit card processing fees. You need to include the cost of making the software
Unit economics is about the incremental value and costs of each additional customer.
You do not amortize the cost of software into the unit economics calculations. You only include the incremental costs of additional customers.
> just like you need to include the cost of making the models.
The cost of making the models is important overall, but it’s not included in the unit economics or when calculating the cost of inference.
That isn't what unit economics is. The purpose of unit economics is to answer: "How much money do I make (or lose) if I add one more customer or transaction?". Since adding an additional user/transaction doesn't increase the cost of training the models you would not include the cost of training the models in a unit economics analysis. The entire point of unit economics is that it excludes such "fixed costs".
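A minimal sketch of what that per-unit calculation looks like, with made-up numbers (none of these figures come from the article):

```python
# Unit economics: only costs that scale with each additional request are counted.
# Fixed costs (training runs, R&D payroll) sit outside this calculation entirely.
# All numbers here are assumed for illustration.

price_per_1m_tokens = 10.00            # assumed revenue per 1M tokens
inference_cost_per_1m_tokens = 3.00    # assumed marginal compute cost per 1M tokens

contribution_margin = price_per_1m_tokens - inference_cost_per_1m_tokens
margin_pct = contribution_margin / price_per_1m_tokens

print(f"${contribution_margin:.2f} per 1M tokens ({margin_pct:.0%} margin)")
# Whether the company is profitable overall then depends on whether this margin,
# times total volume, covers the fixed costs. That is a separate question.
```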
There is no marginal cost for training, just like there's no marginal cost for software. This is why you don't generally use unit economics for analyzing software company breakeven.
The only reason unit economics aren't generally used for software companies is the profit margin is typically 80%+. The cost of posting a Tweet on Twitter/X is close to $0.
Compare the cost of tweeting to the cost of submitting a question to ChatGPT. The fact that ChatGPT rate limits (and now sells additional credits to keep using it after you hit the limit) indicates there are serious unit economic considerations.
We can't think of OpenAI/Anthropic as software businesses. At least from a financial perspective, it's more similar to a company selling compute (e.g. AWS) than a company selling software (e.g. Twitter/X).
You can amortise the training cost across billions of inference requests though. It's the marginal cost for inference that's most interesting here.
The thing about large fixed costs is that you can just solve them with growth. If they were losing money on inference alone no amount of growth would help. It's not clear to me there's enough growth that everybody makes it out of this AI boom alive, but at least some companies are going to be able to grow their way to profitability at some point, presumably.
But what about running Deepseek R1 or (insert other open weights model here)? There is no training cost for that.
1. Someone is still paying for that cost.
2. “Open source” is great but then it’s just a commodity. It would be very hard to build a sustainable business purely on the back of commoditized models. Adding a feature to an actual product that does something else though? Sure.
There is plenty of money to be made from hosting open source software. AWS for instance makes tons of money from Linux, MySQL, Postgres, Redis, hosting AI models like DeepSeek (Bedrock) etc.
> You have to include all the costs associated with the model, not just inference.
The title of the article directly says “on inference”. It’s not a mistake to exclude training costs. This is about incremental costs of inference.
Hacker News commenters just can't help but critique things even when they're missing the point
The parent commenter’s responses are all based on a wrong understanding of what unit economics means.
You don’t include fixed costs in the unit economics. Unit economics is about incremental costs.
I know I'm agreeing with you. I'm saying, don't bother with him lol
Your comment may apply to the original commenter “missing” the point of TFA and to the person replying “missing” the point of that comment. And to my comment “missing” the point of yours - which may have also “missed” the point.
I’ve clearly “missed” the point you were trying to make, because there’s nothing complicated: The article is about unit economics and marginal costs of inferences and this comment thread is trying to criticize the article based on a misunderstanding of what unit economics means.
I was not trying to make any point. I’m not even sure if the comment I replied to was suggesting that it was you or the other commenter who was missing some point or another.
It's fun to work backwards, but I was listening to a podcast where the journalists were talking about a dinner that Sam Altman had.
This question came up and Sam said they were profitable if you exclude training and the COO corrected him
So at least for OpenAI, the answer is “no”
They did say it was close
And that’s if you exclude training costs which is kind of absurd because it’s not like you can stop training
Worth noting that the post only claims they should be profitable for the inference of their paying customers on a guesstimated typical workload. Free users and users with atypical usage patterns will obviously skew the whole picture. So the argument in the post is at least compatible with them still losing money on inference overall.
Excluding training, two of their biggest costs will be payroll and inference for all the free users.
It's therefore interesting that they claimed it was close: this supports the theory that inference from paid users is a (big) money maker, if it's close to covering all the free usage and their payroll costs?
There’s no mention of that in this article about it:
https://archive.is/wZslL
They quote him as saying inference is profitable and leave it at that.
Are you saying that the COO corrected him at the dinner, or on the podcast? Which podcast was it?
From a journalist at the dinner:
“I think that tends to end poorly because as demand for your service grows, you lose more and more money. Sam Altman actually addressed this at dinner. He was asked basically, are you guys losing money every time someone uses ChatGPT?
And it was funny. At first, he answered, no, we would be profitable if not for training new models. Essentially, if you take away all the stuff, all the money we're spending on building new models and just look at the cost of serving the existing models, we are sort of profitable on that basis.
And then he looked at Brad Lightcap, who is the COO, and he sort of said, right? And Brad kind of like squirmed in his seat a little bit and was like, well, we're pretty close.
We're pretty close. We're pretty close.
So to me, that suggests that there is still some, maybe small negative unit economics on the usage of ChatGPT. Now, I don't know whether that's true for other AI companies, but I think at some point, you do have to fix that because as we've seen for companies like Uber, like MoviePass, like all these other sort of classic examples of companies that were artificially subsidizing the cost of the thing that they were providing to consumers, that is not a recipe for long-term success.”
From Hard Fork: Is This an A.I. Bubble? + Meta’s Missing Morals + TikTok Shock Slop, Aug 22, 2025
GPT-5 was, I suppose, their attempt to make a product that provides metrics as good as their earlier products.
Uber doesn't really compare, as they had existing competition from taxi companies that they first had to/have to destroy. And cars or fuel didn't get 10x cheaper over the time of Uber's existence, but I'm sure that they still can optimize a lot for efficiency.
I'm more worried about OpenAI's ability to build a good moat. Right now it seems that each success is quickly replicated by the competing companies. Each month there is a new leader in the benchmarks. Maybe the moat will be the data in the end, i.e. there are barriers nowadays to crawling many websites that have lots of text. Meanwhile those sites might make agreements with the established AI players, and maybe some of those agreements will be exclusive. Not just for training but also for updating wrt world news.
Thanks!
"This article is like saying an apartment complex isn’t “losing money” because the monthly rents cover operating costs but ignoring the cost of the building. Most real estate developments go bust because the developers can’t pay the mortgage payment, not because they’re negative on operating costs."
Exactly the analogy I was going to make. :)
It's funny you mention apartments, because that is exactly the comparison I thought of, but with the opposite conclusion. If you buy an apartment with debt, but get positive cash flow from rent, you wouldn't call that unprofitable or a bad investment. It takes X years to recoup the initial debt, and as long as X is achievable that's a good deal.
Hoping for something net profitable including fixed costs from day 1 is a nice fantasy, but that’s not how any business works or even how consumers think about debt. Restaurants get SBA financing. Homeowners are “net losing money” for 30 years if you include their debt, but they rightly understand that you need to pay a large fixed cost to get positive cash flow.
R&D is conceptually very similar. Customer acquisition also behaves that way.
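A rough illustration of that payback math, with made-up numbers:

```python
# Payback-period framing of the apartment/R&D analogy. All figures are assumed.

upfront_cost = 600_000           # assumed purchase/build (or training) cost
monthly_net_cash_flow = 5_000    # assumed rent (or inference margin) net of operating costs

payback_years = upfront_cost / monthly_net_cash_flow / 12
print(f"Payback period: {payback_years:.0f} years")  # 10 years

# The objection upthread is effectively that the "building" gets torn down and
# rebuilt long before the payback period elapses, which is what breaks the math.
```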
Running with your analogy, having positive cash flow and buying a property to hold for the long term makes sense. That's the classic mortgage scenario. But it takes time for that math to work out. Buying a new property every 6 months breaks that model. That's like folks who keep buying a new car and rolling "negative equity" into a new deal. It's insanity financially but folks still do it.
I found Dario’s explanation pretty compelling:
https://x.com/FinHubIQ/status/1960540489876410404
The short of it: if you do the accounting on a per-model basis, it looks much better.
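For anyone who doesn't want to watch it, the framing is roughly to treat each model generation as its own P&L: its training cost against the inference margin it earns over its lifetime. A sketch with made-up numbers:

```python
# Per-model accounting: each generation's training cost vs. the inference
# margin it earns over its lifetime. All figures are illustrative assumptions.

models = [
    # (name, training_cost, lifetime_inference_revenue, lifetime_inference_cost)
    ("gen_1", 100e6,  400e6, 150e6),
    ("gen_2", 500e6, 1500e6, 500e6),
]

for name, training, revenue, serving in models:
    profit = revenue - serving - training
    print(f"{name}: ${profit / 1e6:.0f}M over its lifetime")

# Each generation can be profitable on its own while the company-wide books
# still show a loss, because the (larger) training bill for the next generation
# is being paid while the current one earns.
```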
That was worth a watch, thank you!
I don't think it's an accounting error when the article title says "Are OpenAI and Anthropic Really Losing Money on Inference?"
And it's a relevant question because people constantly say these companies are losing money on inference.
I think the nuance here is what people consider the “cost” of “inference.” Purely on compute costs and not accounting for the cost of the model (which is where the article focuses) it’s not bad.
Their assumption is that training is a fixed cost: you'll spend the same amount on training for 5 users as you will with 500 million users.
Spending hundreds of millions of dollars on training when you are two guys in a garage is quite significant, but the same amount is absolutely trivial if you are planet-scale.
The big question is: how will training cost develop? Best-case scenario is a one-and-done run. But we're now seeing an arms race between the various AI providers: worst-case scenario, can the market survive an exponential increase in training costs for sublinear improvements?
They just won’t train it. They have the choice.
Why do you think they will mindlessly train extremely complicated models if the numbers don’t make sense?
Because they are trying to capture the market, obviously.
Nobody is going to pay the same price for a significantly worse model. If your competitor brings out a better model at the same price point, you either a) drop your price to attract a new low-budget market, b) train a better model to retain the same high-budget market, or c) lose all your customers.
You have taken on a huge amount of VC money, and those investors aren't going to accept options A or C. What is left is option B: burn more money, build an even better model, and hope your finances last longer than the competition.
It's the classic VC-backed startup model: operate at a loss until you have killed the competition, then slowly increase prices as your customers are unable to switch to an alternative. It worked great for Uber & friends.
> If the cash flow was truly healthy these companies wouldn’t need to raise money.
If this were true, the stock market would have no reason to exist.
My observation is that Opus is chronically capacity constrained while being dramatically more expensive than any of the others.
To me that more or less settles both "which one is best" and "is it subsidized".
Can't be sure, but anything else defies economic gravity.
Or Opus is a great model so demand is high and the provider isn't scaling the platform. I agree something defies gravity.
Also that's not accounting for free riders.
I have probably consumed trillions of free tokens from OpenAI infra since GPT-3 and never spent a penny.
And now I'm doing the equivalent on Gemini, since Flash is free of charge and a better model than most other free models.
I think this is missing the point that the very interesting article makes.
You're arguing that maybe the big companies won't recoup their investment in the models, or profitably train new ones.
But that's a separate question. Whether a model - which now exists! - can profitably be run is very good to know. The fact that people happily pay more than the inference costs means what we have now is sustainable. Maybe Anthropic or OpenAI will go out of business or something, but the weights have been calculated already, so someone will be able to offer that service going forward.
It hasn't even proven that; it's assuming a ridiculous daily usage and also ignoring free riders. Running a model is likely not profitable for any provider right now. Even a public company (e.g. Alphabet) isn't obliged to report honest figures, since numbers on the sheets can be moved left and right. We won't know for another year or two, when the companies we have today start falling and their founders start talking.
> if you have healthy positive cash flow you have much better mechanisms available to fund capital investment other than selling shares. Eg issue a bond against that healthy cash flow.
Is that actually true in 2025? Presumably you have to make coupon payments on a bond(?), but shares are free. Companies like Meta have shown you can issue shares that don't come with voting rights and people will buy them, and meme stocks like GME have demonstrated the effectiveness of churning out as many shares as the market will bear.
Agree it's not the fashionable thing. There's a line from The Big Short: "This is Wall Street, Dr. Burry. If you offer us free money we're going to take it."
These companies are behaving the same way. Folks are willing to throw endless money into the present pit so on the one hand I can’t blame them for taking it.
Reality is, though, that when the hype wears off it's only throwing more gasoline on the fire and building a bigger pool of investors that will become increasingly desperate to salvage returns. History says time and time again that story doesn't end well, and that's why the voices mumbling "bubble" under their breath are getting louder every day.
What will be the knock on effect on us consumers?
Self hosting LLMs isn’t completely out of the realm of feasibility. Hardware cost may be 2-3x a hardcore gaming rig but it would be neat to see open source, self hosted, coding helpers. When Linux hit the scenes it put UNIX(ish) power in the hands of anyone with no license fee required. Surely somewhere someone is doing the same with LLM assisted coding.
The only reasons to have a local model right now are privacy and hobby use.
The economics are awful and local model performance is pretty lackluster by comparison, never mind that it's much slower with a narrower context length.
$6,000 is 2.5 years of a $200/mo subscription. And in 2.5 years that $6k setup will likely be equivalent to a $1k setup of the time.
We don't even need to compare it to the most expensive subscriptions.
The $20 subscription is far more capable than anything I could build locally for under $10k.
Costs will go up to levels where people will no longer find this stuff as useful/interesting. It's all fun and games until the subsidies end.
See the recent reactions to AWS's pricing on Kiro, where folks had a big WTF reaction after, it appears, AWS tried to charge realistic prices based on what this stuff actually costs.
Isn’t AWS always quite expensive? Look at their margins and the amount of cash it throws off, versus the consumer/retail business which runs a ton more revenue but no profit.
If you’re applying the same pricing structure to Kiro as to all AWS products then, yeah, it’s not particularly hobbyist accessible?
The article is answering a specific question, and has excluded this on purpose. If you have a sunk training cost you still want to know if you can at least operate profitably.
API prices are going up and rate limits are getting more aggressive (see what's going on with Cursor and Claude Code)
The model is like a house. It can be upgraded. And it can be sold.
Think of the model as an investment.
> Think of the model as an investment.
Exactly, or a factory.