Gemini seems to be pretty awful at agentic coding. It always finishes the task, but when I look at the result, it has just broken my code.
I'm not sure the fault is that it's "doing bad code"; I guess it's just not good at being agentic. I saw this with Gemini CLI and other tools.
GLM, Kimi, and Qwen-Code all behave better for me.
Gemini 3 will probably fix this, as Gemini 2.5 Pro is "old" by now.
Gemini CLI is bad; the model itself is really good.