Most likely OpenAI has models at least as efficient as DeepSeek or Qwen. Cerebras offers both GPT-OSS-120B and Qwen3-235B-Instruct [1]; the second has roughly twice as many parameters as the first, and that's the closest comparison I can find. The Qwen model is about twice as slow (1,400 tokens/second vs. 3,000) and 60% more expensive ($1.20 per million tokens vs. $0.75). And OpenAI serves a proprietary model in production, which is most likely far more optimized than the free version they release for public use.

[1] https://inference-docs.cerebras.ai/models/overview
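
The ratios are easy to check as a quick calculation (a minimal sketch; the inputs are just the Cerebras pricing-page figures from [1] as of this writing, not anything I measured):

    # Cerebras' published numbers for the two models, per [1]
    gpt_oss = {"params_b": 120, "tok_per_s": 3000, "usd_per_m_tok": 0.75}
    qwen3   = {"params_b": 235, "tok_per_s": 1400, "usd_per_m_tok": 1.20}

    print(f"{qwen3['params_b'] / gpt_oss['params_b']:.2f}x the parameters")    # 1.96x
    print(f"{gpt_oss['tok_per_s'] / qwen3['tok_per_s']:.2f}x the throughput")  # 2.14x
    print(f"{qwen3['usd_per_m_tok'] / gpt_oss['usd_per_m_tok'] - 1:.0%} more expensive")  # 60%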

Inference is not the main cost driver; training and research are.

I'm not sure that's still the case. OpenAI reported $6.7 BN in costs for the first half of 2025. For training and research to be the main driver, they would have had to spend more than half of that, over $3.35 BN, which I doubt. They have 700 million weekly users, and many of them are really heavy users. Taking myself as an example: I probably consumed a few million tokens with GPT-5-Codex in the last three days alone. I am a heavy user, but I'd bet there are users who burn through hundreds of times more tokens than I do.
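
Whether inference plausibly exceeds half of that $6.7 BN turns almost entirely on the average token consumption you assume. A back-of-envelope sketch (the user count and the cost total come from the figures above; the per-user token tiers and the $0.50 per million internal serving cost are purely my assumptions, not known numbers):

    # Fermi estimate of OpenAI's H1-2025 inference cost under assumed usage levels.
    weekly_users = 700e6         # reported weekly active users
    weeks = 26                   # first half of 2025
    serve_cost_per_m_tok = 0.50  # assumed internal serving cost in USD; a guess

    for avg_tok_per_user_week in (20_000, 100_000, 500_000):  # assumed usage tiers
        total_m_tok = weekly_users * weeks * avg_tok_per_user_week / 1e6
        print(f"{avg_tok_per_user_week:>7,} tok/user/week -> "
              f"${total_m_tok * serve_cost_per_m_tok / 1e9:.2f} BN")
    # prints ~$0.18 BN, $0.91 BN, $4.55 BN respectively, vs. $6.7 BN total costs

At the heavy-usage end of those assumptions, the inference bill alone rivals the reported total, which is the intuition behind doubting that training and research still dominate.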

Absolutely not true.