> So maybe the AI labs have been paying attention after all!

> I think this mainly demonstrates that the pelican on the bicycle has firmly exceeded its limits as a useful benchmark.

As acknowledged in the article.

Gemini 3.1 basically takes it home on that benchmark, anyway, it's done.