I spent 10 minutes with it in their new "agy" CLI tool and immediately found it is nowhere close to GPT 5.5 high in codex. It was sloppy and made poor assumptions in its analysis. It would have produced a mess if I had let it go ahead with its plan. And like previous versions of Gemini, its tool use was poor (e.g. it said "I wrote a file with the plan..." but the file was never written).

For reference, this is a Rust codebase: deep "systems" stuff (database, compiler, virtual machine / language runtime).

They're still months behind OpenAI and Anthropic on coding.

Mind you, I find Claude Code careless and unreliable these days, too. (But it's good at tool use, at least.)

I do use Gemini for "lifestyle" AI usage (web research etc) tho.