Hacker News

flatline 17 hours ago

I don't think anyone has a firm grasp on actual inference costs -- including the research and training that has gone into those models. We've got near-frontier capabilities from open source models from China at pennies on the dollar compared to US big tech rollouts. OpenAI and Anthropic are heavily subsidizing their inference -- no wait, they are charging the most they can get away with before going public. Where is the truth?

schaefer 16 hours ago

> I don't think anyone has a firm grasp on actual inference costs.

There are huge numbers of users (myself included) that do have an exact idea of what inference costs are - on open models. We can buy tokens from 3rd parties that have no motivation to subsidize our use. That's to say, there's a fair marketplace[1] and we're hanging out there.

If you want to say "I don't think anyone has a firm grasp on actual inference costs on these proprietary/closed models", then I could agree with that.

[1]: https://openrouter.ai/rankings#leaderboard

andrewmutz 17 hours ago

Both can be true. They can be charging what the market will bear, and still be charging less than their costs of running it.

wyre 16 hours ago

There is no way I'm believing DeepSeek can charge less than $1 USD for their pro model while Opus costs over 25x more, and yet Opus is priced below the cost of running it.

kube-system 15 hours ago

It would seem strange if they were operating in the same economy, but they aren't. DeepSeek operates in an economy with a high degree of central planning.

China subsidizes strategic industries, and they have heavily done so with AI. And DeepSeek specifically has said they have no commercialization plans.

For example: https://www.boc.cn/aboutboc/bi1/202501/t20250123_25254674.ht...

wyrdcurt 11 hours ago

DeepSeek is not the only provider of inference for their models. Chinese subsidies likely do explain DeepSeek's ability to provide inference cheaper than other providers, but even a US provider like DeepInfra can serve DeepSeek 4 Pro at $1.30/M in and $2.60/M out. Unless American labs are doing something wildly inefficient, it feels safe to assume Anthropic has some profit margin on inference at API prices.
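To make the per-token pricing concrete, here is a minimal sketch of what a job costs at the DeepInfra prices quoted above. The token counts are hypothetical placeholders, not measurements, and `job_cost_usd` is a name invented for this example.

```python
def job_cost_usd(input_tokens: int, output_tokens: int,
                 usd_per_m_in: float, usd_per_m_out: float) -> float:
    """Cost of one job under simple per-million-token pricing."""
    return (input_tokens * usd_per_m_in + output_tokens * usd_per_m_out) / 1_000_000


# Prices are the ones quoted in this comment; the token counts are made up.
cost = job_cost_usd(
    input_tokens=50_000_000,
    output_tokens=5_000_000,
    usd_per_m_in=1.30,
    usd_per_m_out=2.60,
)
print(f"${cost:.2f}")
```

With those placeholder counts the job comes to $78. Swapping in another provider's per-million prices gives a like-for-like comparison on the same workload.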

kube-system 11 hours ago

They may, neglecting R&D overhead. But also, some suspect that US models are significantly heavier than DeepSeek in resource consumption by multiple measures.

It’s generally accepted that Anthropic/OpenAI are going for all-out performance with big VC dollars at the expense of efficiency, while China has geopolitically limited compute and an incentive to compete on value per dollar.

re-thc 15 hours ago

> There is no way I'm believing DeepSeek can charge less

Why not? Hetzner charges WAY less than AWS too. Can you not believe that?

orangecat 9 hours ago

That's the point. Hetzner is presumably covering their costs, so it's a safe bet that AWS is profitable.

dontlikeyoueith 16 hours ago

> OpenAI and Anthropic are heavily subsidizing their inference -- no wait, they are charging the most they can get away with before going public. Where is the truth?

Both. They are charging the most they can get away with, and that amount is still heavily subsidized by venture capital.

InsideOutSanta 16 hours ago

> I don't think anyone has a firm grasp on actual inference costs -- including the research and training that has gone into those models

We know roughly how much these companies spend and what their revenues are. Based on that, they'd have to more than double revenue (without spending more money) just to stay even, and that's not good enough given how deep in the hole they are.

> OpenAI and Anthropic are heavily subsidizing their inference -- no wait, they are charging the most they can get away with before going public. Where is the truth?

Both are true. I mean, I'd be willing to spend a bit more than I do now, but not more than double, and neither would most companies. The company I work for is currently investigating how to reduce LLM spend, not looking to spend more.

pimeys 16 hours ago

We pay by token at work. I just finished one session with Opus that was 4000 dollars. In about three days.

Now that 200USD subscription starts to feel cheap...

zozbot234 16 hours ago

That would be ~300 tok/s over 72 hours at Claude Fable output token prices? I'm not sure that passes a sanity check.
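As a rough check of that figure, here is a minimal sketch that backs out the per-token price implied by the numbers in this thread ($4000, about 72 hours, ~300 tok/s). It assumes the whole bill was output tokens at one flat rate, which is a simplification: input and cached tokens are ignored, and the function name is made up for illustration.

```python
def implied_price_per_million(total_usd: float, tokens_per_second: float,
                              hours: float) -> float:
    """Back out the flat $/M-token price implied by a bill, a rate, and a duration.

    Assumption: every billed token is an output token at a single flat price.
    """
    total_tokens = tokens_per_second * hours * 3600  # 3600 seconds per hour
    return total_usd / (total_tokens / 1_000_000)


# Figures quoted in the thread: $4000 over about three days at ~300 tok/s.
price = implied_price_per_million(4000, 300, 72)
print(f"{price:.2f} USD per million tokens")
```

That works out to roughly $51 per million tokens across about 77.8M tokens. If a meaningful share of the bill was input tokens, the real sustained output rate would be lower than 300 tok/s.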

unholiness 16 hours ago

Subagents are a helluva drug.

esafak 15 hours ago

That's the price of several engineers!

rubyn00bie 16 hours ago

Just outta curiosity, as I’ve never gotten a spend anywhere near that, what variant were you using? Like max context window and fast mode? Or was it just chugging along non stop for three days?

pimeys 16 hours ago

Fast mode, max context window. The task was: move all 1600+ queries from one database to another and make the whole integration test suite pass. We did multiple passes, each with a different concern in the switch from one database to the other. My OpenCode session right now says $4,365.02.

I haven't gotten close to this before either, but this time we wanted to move fast because this branch gets conflicts all the time and we want to get the migration over with ASAP.

rglullis 15 hours ago

It's a bit of a left-field question, but I'm curious: let's say the company wasn't paying the whole bill but only subsidizing it, e.g. it paid 90% of the $4000. What would you do?

pimeys 14 hours ago

I don't know, why would I pay to do my job? It's not my first database switch for a startup. Only this time it doesn't take two months of grueling work. I know exactly how this is done, but the amount of grunt programming and testing and repetitive work is just not great. And it's not a task that brings new customers or a new product. Just a mandatory and annoying thing to deal with when we are growing.

And don't get me wrong. Opus did an absolutely horrible job in the first, second and third rounds of this task. You really needed to steer it to get to the right solution.

And now Fable is out. And its first round of code reviews for this huge PR was definitely worth the money too...

Don't think that I'm just shrugging at that number. I see it every day, and I don't like that it's in the thousands now. But for people on the 100 or 200 dollar plans, I'm not sure you'll be able to keep using them in the future if a slightly bigger task costs thousands at token prices...

If I were paying this out of my own pocket, I'd definitely go with DeepSeek or local models and figure out how to make the best use of them.

rglullis 13 hours ago

> If I were paying this out of my own pocket, I'd definitely go with DeepSeek or local models and figure out how to make the best use of them.

IOW, you don't think this work is really worth $4k.

> why would I pay to do my job?

The question is: how long do you think your employer will be willing to pay for both you and Anthropic, if you yourself said that if it were your money you'd put some time and effort into working with an open model?

pimeys 13 hours ago

> The question is: how long do you think your employer will be willing to pay for both you and Anthropic, if you yourself said that if it were your money you'd put some time and effort into working with an open model?

I'm not sure what this question really means. Anthropic is useless if you don't know what to do with it. It's very useful if you do, and you can guide it to do the right things. Yes, it will for sure reduce the number of people we need to hire. But we are always looking for hires who know what they're doing and can use agents to be faster.

But if the question is how long an employer will be willing to pay 10-20k per month per seat for Anthropic: I can't see that being feasible, and it will have to end at some point.

rglullis 12 hours ago

Regardless of the actual value produced by the models, if I am the CTO of any company that has the budget to spend $10k/month/seat on Claude, I'd take 5%-10% of that to build an alternative in-house.

pimeys 3 hours ago

I'm with you here. We can't slide into a situation where you have to put a sizable part of your budget into an American mega corporation if you want to survive the competition. We need local models and we need them to be good enough to help us.

internet101010 9 hours ago

Indefinitely for these big mundane grunt jobs. In every scenario it is going to be cheaper and faster than lobbing it over to Infosys.

logicchains 16 hours ago

We have a firm grasp on actual inference costs from the various open weights model providers on OpenRouter. They don't have the money to subsidize inference and it's quite a competitive market, so the prices are representative of the costs.

MichaelMedbed 17 hours ago

[flagged]

kllrnohj 17 hours ago

Regardless of whether that's true or not, US companies doing hosted inference of the models coming out of China are also significantly cheaper than OpenAI or Anthropic.

polski-g 17 hours ago

Not relevant to the post.