With these being 1M context size, does that all but confirm that Quasar Alpha and Optimus Alpha were cloaked OpenAI models on OpenRouter?
Yes, confirmed by citing Aider benchmarks: https://openai.com/index/gpt-4-1/
Which means that these models are _absolutely_ not SOTA: Gemini 2.5 Pro is much better, Sonnet is better, and even R1 is better.
Sorry Sam, you are losing the game.
Aren’t all of these reasoning models?
Won’t benchmarking OpenAI's reasoning models against these be the real test of whether Sam is losing?
Sonnet 3.7 non-reasoning is better on its own. In fact even Sonnet 3.5-v2 is, and that was released 6 months ago. Now to be fair, they're close enough that there will be use cases - especially non-coding - where 4.1 beats it consistently. Also, 4.1 is quite a lot cheaper and faster. Still, OpenAI is clearly behind.
There is no OpenAI model better than R1, reasoning or not (as confirmed by the same Aider benchmark; non-coding tests are less objective, but I think it still holds).
With Gemini (current SOTA) and Sonnet (great potential, but tends to overengineer/overdo things) it is debatable; they are probably better than R1 (and, by extension, all OpenAI models).
Even without reasoning, isn't DeepSeek V3 from March better?
Yes, OpenRouter confirmed it here - https://x.com/OpenRouterAI/status/1911833662464864452
I think Quasar is fairly confirmed [0] to be OpenAI.
[0] https://x.com/OpenAI/status/1911782243640754634