How are they not SOTA? They're all very similar, with ChatGPT being the worst (for my use case anyway). It does things like adding lambdas and random C++ function calls into my Vulkan shaders.
Gemini 2.5 Pro is the most capable for my use case in PyTorch as well. Large context and much better instruction following for code edits make a big difference.
Gemini 2.5 Pro is generally non-competitive with GPT-5-medium or Sonnet 4.5.
But never fear, Gemini 3.0 is rumored to be coming out Tuesday.
The tweets I've seen from random people said Oct 9th, which is Thursday. I suppose we will know when we know.
Based on what? LLM benchmarks are all bullshit, so this is based on... your gut?
Gemini outputs what I want about as regularly as the other bots.
I'm so tired of the religious thinking around these models. Show me a measurement.
> LLM benchmarks are all bullshit
> Show me a measurement
Your comment encapsulates why we have religious thinking around models.
Please tell me this comment is a joke.