Most likely OpenAI has models at least as efficient as DeepSeek or Qwen. Cerebras offers both GPT-OSS-120B and Qwen3-235B-Instruct; the second has roughly twice as many parameters as the first, but that's the closest comparison I can find. So the Qwen model is twice as large, yet less than half as fast (1400 tokens/second vs 3000) and 60% more expensive ($1.20 per million tokens vs $0.75). And OpenAI is running a proprietary model, which is most likely much more optimized than the free version they release for public use.
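For concreteness, here is the arithmetic behind those ratios, using only the Cerebras figures quoted above (a quick sketch, nothing more):

```python
# Back-of-envelope comparison of the two Cerebras-hosted models,
# using the throughput and price figures quoted above.
gpt_oss = {"params_b": 120, "tok_per_s": 3000, "usd_per_mtok": 0.75}
qwen3 = {"params_b": 235, "tok_per_s": 1400, "usd_per_mtok": 1.20}

size_ratio = qwen3["params_b"] / gpt_oss["params_b"]            # ~1.96x the parameters
speed_ratio = gpt_oss["tok_per_s"] / qwen3["tok_per_s"]         # ~2.14x faster
price_ratio = qwen3["usd_per_mtok"] / gpt_oss["usd_per_mtok"]   # 1.60x the price

print(f"size: {size_ratio:.2f}x, speed: {speed_ratio:.2f}x, price: {price_ratio:.2f}x")
# size: 1.96x, speed: 2.14x, price: 1.60x
```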
Inference is not the main cost driver; training and research are.
That used to be the case, but I doubt it still is. OpenAI reported $6.7 BN in costs for the first half of 2025, and I doubt they spent $3 BN of that on training and research. They have 700 million weekly users, and many of them are really heavy users. Taking myself as an example: I probably consumed a few million tokens with GPT-5-Codex in the last 3 days alone. And while I am a heavy user, I suspect there are users who burn through hundreds of times more tokens than I do.
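To show the scale involved, a toy estimate. Every input below except the 700 million weekly-user figure is a made-up placeholder, and the result scales linearly with both guesses, so treat the output as illustrative only:

```python
# Very rough inference-cost estimate for H1 2025 (26 weeks).
# Only weekly_users comes from the figure quoted above; the other
# two inputs are pure guesses for illustration.
weekly_users = 700e6             # from OpenAI's reported weekly user count
avg_tokens_per_user_week = 100e3 # guess: most users are light, a few extremely heavy
cost_per_mtok = 1.00             # guess at internal serving cost, USD per million tokens

weeks = 26
total_tokens = weekly_users * avg_tokens_per_user_week * weeks
inference_cost = total_tokens / 1e6 * cost_per_mtok
print(f"~${inference_cost / 1e9:.1f}B on inference in H1")
# ~$1.8B with these guesses; double either guess and it doubles too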
Absolutely not true.