This just shows that Google needs to double down on its AI models fast. Even open-source Chinese models are beating 3.1 Pro and 3.5 Flash in almost everything.
Gemma 4 beat Gemini 3.1 Pro as well. In a later replication test that I haven't published yet, it found more bugs than any other model when given multiple attempts, though not consistently. So it seems Google is doing real work, just on making models more efficient rather than bigger. Gemma 4 12B is the most effective vision model I've tested, even compared with models several times its size.
Google said they would release 3.5 Pro this month. I've been waiting for a month now.