Why would you think that deepseek is more efficient than gpt-5/Claude 4 though? There's been enough time to integrate the lessons from deepseek.
Because to make GPT-5 or Claude better than previous models, you need to do more reasoning, which burns a lot more tokens. So your per-token cost may drop, but you may also need a lot more tokens overall.
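A minimal sketch of the trade-off, using entirely made-up prices and token counts (not real pricing for any of these models): a cheaper per-token rate can still cost more per request if the model burns several times as many reasoning tokens to get there.

```python
def total_cost(price_per_million_tokens: float, tokens_used: int) -> float:
    """Total dollar cost for a single request."""
    return price_per_million_tokens * tokens_used / 1_000_000

# Hypothetical figures for illustration only:
cheap_model_cost = total_cost(price_per_million_tokens=2.0, tokens_used=1_000)   # few tokens, higher rate
reasoning_model_cost = total_cost(price_per_million_tokens=1.0, tokens_used=5_000)  # lower rate, 5x the tokens

print(f"Short answer:        ${cheap_model_cost:.4f}")      # $0.0020
print(f"Heavy reasoning run: ${reasoning_model_cost:.4f}")  # $0.0050 -- cheaper per token, pricier per request
```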
GPT-5 can be configured extensively. Is there any configuration of GPT-5 that offers ~DeepSeek-level performance yet ends up more expensive per token than DeepSeek?