How good is it for coding, relative to recent frontier models like GPT 5.x, Sonnet 4.x, etc.?

My experience so far: much less reliable, though that’s been in chat, not opencode or Antigravity etc. You give it a program and say "change it in this way", and it just throws stuff away, changes unrelated stuff, etc. Completely different quality than Pro (or Sonnet 4.5 / GPT-5.2).

Been thinking of having Opus generate plans and then having Gemini 3 Flash execute. Might be better than using Haiku for the same.

Anyone tried something similar already?

So why is Flash so high in LiveCodeBench Pro?

BTW, I have the same impression: Claude was working better for me on coding tasks.

In my own, very anecdotal, experience, Gemini 3 Pro and Flash are both more reliably accurate than GPT 5.x.

I have not worked with Sonnet enough to give an opinion there.