Why does OpenAI need so much more compute than everybody else? DeepSeek, Qwen and many others build competitive models that need much less capital.
Chinese companies need to pay much higher prices for the same GPUs, so they would need to charge more to make a profit, but it's difficult to charge more unless they have a much better product. So building massive data centers to gain market share is riskier for them.
That said, Alibaba not releasing the weights for Qwen3-Max and announcing $53 billion in AI infrastructure spending https://www.reuters.com/world/china/alibaba-launches-qwen3-m... suggests that they think they're now at a point where it makes sense to scale up. (The Reuters article mentions data centers in several countries, which I assume also helps work around high GPU prices in China.)
Circling back to OpenAI: I don't think they're spending so much on infrastructure just because they want to train bigger models on more data, but more so because they want to serve those bigger models to more customers who use their services more intensively.
Most likely OpenAI has models at least as efficient as DeepSeek's or Qwen's. The closest public comparison I can find: Cerebras serves both GPT-OSS-120B and Qwen3-235B-Instruct [1]. The Qwen model has roughly twice as many parameters, but it runs at less than half the speed (1,400 tokens/second vs. 3,000) and costs 60% more ($1.20 per million tokens vs. $0.75). And OpenAI serves proprietary models, which are most likely far more optimized than the free weights they release for public use.
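The comparison above works out to roughly these ratios (figures taken from the Cerebras listing cited below; treat this as a back-of-envelope sketch, since pricing and throughput change over time):

```python
# Back-of-envelope comparison of two models served on Cerebras hardware.
# Numbers are the ones quoted in this thread, not a fresh benchmark.
models = {
    "GPT-OSS-120B":        {"params_b": 120, "tok_per_s": 3000, "usd_per_mtok": 0.75},
    "Qwen3-235B-Instruct": {"params_b": 235, "tok_per_s": 1400, "usd_per_mtok": 1.20},
}

small = models["GPT-OSS-120B"]
large = models["Qwen3-235B-Instruct"]

param_ratio = large["params_b"] / small["params_b"]          # ~1.96x the parameters
speed_ratio = small["tok_per_s"] / large["tok_per_s"]        # ~2.14x slower
price_ratio = large["usd_per_mtok"] / small["usd_per_mtok"]  # 1.6x, i.e. 60% more

print(f"Qwen3-235B: {param_ratio:.1f}x parameters, "
      f"{speed_ratio:.1f}x slower, "
      f"{(price_ratio - 1):.0%} more expensive per token")
```

Note that the per-token price gap (60%) is smaller than the parameter gap (~2x), which is consistent with larger models being somewhat cheaper per parameter to serve at scale.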
[1] https://inference-docs.cerebras.ai/models/overview
Inference is not the main cost driver, training and research is.
I'm not sure that's still the case. It may have been once, but I doubt it is now. OpenAI had about $6.7 billion in costs for the first half of 2025, and I doubt half of that went to training and research. They have 700 million weekly users, and many of those are really heavy users. Taking myself as an example: I probably consumed a few million tokens with GPT-5-Codex in the last 3 days alone. I'm a heavy user, but I think there are users who burn through hundreds of times more tokens than I do.
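Extrapolating my own usage gives a rough feel for the serving cost of one heavy user. The per-million-token cost below is a hypothetical assumption (in the ballpark of the Cerebras prices mentioned earlier), not a known OpenAI figure:

```python
# Rough sketch: monthly serving cost for one heavy user.
# cost_per_mtok_usd is an assumed internal cost, not a published number.
tokens_per_3_days = 3_000_000   # "a few million tokens in the last 3 days"
cost_per_mtok_usd = 1.00        # hypothetical internal cost per million tokens

monthly_tokens = tokens_per_3_days * (30 / 3)           # extrapolate to 30 days
monthly_cost = monthly_tokens / 1_000_000 * cost_per_mtok_usd

print(f"~{monthly_tokens / 1e6:.0f}M tokens/month -> ~${monthly_cost:.0f}/month")
# At these assumptions, a flat $20/month subscription barely covers this pattern,
# and a user 10x heavier would be served at a loss.
```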
Absolutely not true.
Because they have more distribution than anyone else? Pretty much everyone uses chat.com, but almost nobody uses DeepSeek.
my company runs on deepseek stack. we might move to qwen
Your parents might have heard of ChatGPT, but not of DeepSeek or Qwen.
my father has deepseek on his phone
good for you, genuinely - but you have to accept that's a minority position?
They don't. Your question is just as wrong as asking, "Why does Lockheed Martin need so many materials-science engineers? Chengdu Aircraft makes fighters with 10x fewer."
They’re trying to lock out competition from accessing compute.