Yes. The models are good, the models are fast, and the internal tooling has caught up at this point too. There's a lot of UI/UX/tooling stuff that's still being worked through, integrations with VCS, and solving deeper problems that I probably can't talk about, but I'd say the frustrations of most are about the rate of change much more than the actual abilities.
One thing that's interesting is a bunch of internal thought leaders who swear by the Flash models over the Pro models. Whether this is true or not doesn't really matter; the interesting bit to me is that we are at a point with the models where "better" models are not necessarily more useful, and that a faster model with more work on the harness may be a better trade-off.
> a bunch of internal thought leaders who swear by the Flash models over the Pro models
I'm coming around on this too. deepseek-v4-flash is impressive.
> One thing that's interesting is a bunch of internal thought leaders who swear by the Flash models over the Pro models.
I've seen people outside Google favoring flash Gemini models over the Pro.
There are also some benchmarks where flash models have higher scores, so yes, apparently speed does matter.
You’re absolutely kidding yourself if you genuinely believe that.
Happy to chat internally if you want, feel free to reach out.
I see a lot of people swearing by one model, but without trying others. I see a lot of opinions based on a snapshot of tooling from ~January, when for example Claude Code was exceptional, but that don't appear to have been updated. In blind tests the models appear to be much closer than some folks would have you believe.
I’ll admit it swings back and forth on a six-month cycle or so; however, cost-to-output matters.
Also, for niche use-cases there are clear winners.
If you mean specifically the Gemini VS Code Extension: it's terrible compared to Claude Code or Codex. I don't know how they can get away with it. Just constant timeouts, weird failure modes, have to start a new chat to switch modes... but I don't think any of that is specific to Gemini the model; it seems to be the extension.
As for actual solutions to problems, ignoring the VS Code extension aspect, I find all three premier models to be excellent coding agents for my purposes.
The overall quality of LLM coding tools is shockingly bad. I haven't found a single one without major issues, and many have the same problems reappear every few months, sometimes bad enough to almost break the entire thing (e.g. 100% failure rate in editing files, broken for weeks, with the same cause each time, multiple times in a year).
I'd say I'm surprised by it, but uh
> The overall quality of LLM coding tools is shockingly bad
Most of them were vibecoded in days, so what do you expect? And new versions just add features, they never fix the old cruft.
There would probably be some money to be made if someone actually took the time to write a good agent harness.
Note that coding is not the only use of Gemini or any of these models. It's also not what this article is talking about. Gemini may not be the best coding agent and still be very good at other things.
Last month, Steve Yegge suggested that they are not: https://xcancel.com/Steve_Yegge/status/2043747998740689171
> He says the problem is that they can't use Claude Code because it's the enemy, and Gemini has never been good enough to capture people's workflows like Claude has, so basically agentic coding just never really took off inside Google. They're all just plodding along, completely oblivious to what's happening out there right now.
This is a bunch of gabagoo. Wrong on so many layers, it's not even worth reading further.
a) goog has agentic coding in both antigravity & cli forms. While it is not at the level of cc + opus, it's still decent.
b) goog has their own versions of models trained on internal code
c) goog has claude in vertex, and most definitely can set it up in secure zones (like they can for their clients) so they'd be able to use claude (at cost) within their own projects.
Antigravity's workflow is so slow and buggy with permissions that it is unusable compared to cc/codex. The only part that is nice is that it allows usage of Opus.
Gemini CLI is an absolute joke. I don't know if it's the harness or the Gemini models' poor instruction following, but in more than 50% of my sessions Gemini goes insane and ends up in thought loops. In another 25% it does far more updating than I asked or is reasonable.
This is why next to no one talks about either of them. Does Antigravity's agent manager even work yet without crashing and showing zero conversations? In typical Google fashion they released AG and appear to have then set it on autopilot with a skeleton crew of less than 1 developer. Clear issues have not been fixed since day 1. Some settings just do not work. Permissions are not respected.
Agreed, however imo there are definitely some problems unique to Google which are making the internal experience less than ideal.
Hoping they can figure it out sooner rather than later.
Demis Hassabis chimed in on that thread and called it what it is: clickbait.
I’m not so sure. From talking to some of my own friends at Google, they feel that the Antigravity/Gemini models are handicapping them and would much rather be using Claude Code (which only DeepMind gets to use).
Sure, but there's a cavernous distance between "google = john deere" and "darn I have to use Gemini".
He was entirely correct.
He made a follow-up after the pushback by GDM.
Google’s businesses are very broad and durable. But Google being the only company in the world without access (except for GDM+labs) to a competent coding agent will take a toll.
We’ll see how long Google can hold out hoping for GDM to create something that is competitive.
I’m guessing that within 6 months Google will give up on coding and finally let their devs use Claude/Codex.
This isn’t a security problem, this is a GDM issue with GDM’s promises being far beyond their ability.
> But Google being the only company in the world without access (except for GDM+labs) to a competent coding agent will take a toll.
I doubt it. I use Gemini CLI daily because Gemini is what work pays for, and I have a personal Claude account. The difference is not that great, especially if you're not doing full vibe-coding. It's unlikely to have the kind of effect you're describing.
There is value in "eating your own dog food".
If internal staff aren't happy with the tools they build, typically that should drive improvements to their own tools.
This couldn't be further from the truth
The point of dogfooding is exactly that: if we're unhappy, we're the ones to improve.
The engineers using Gemini have no control over DeepMind.
Are you in the Gemini team?
I for one can't tell the difference between Claude and Gemini for coding. And the internal agent tooling is many times faster than Claude Code in my experience.
Lie? Gemini CLI is unusable. The instruction following (IF) of Gemini models is atrocious. Honestly, how often does your Gemini CLI go insane in thought loops and you have to stop it?
I use Jetski.
They use a web-based, VS Code-like editor (Cider) with a custom agent.
Antigravity comes to mind
They use Claude Code at DeepMind.
Codex?
Not a Googler, but I use gemini in JetBrains Junie and have no issues with it. It's cheap, very fast and most importantly actually listens to you.