Hacker News

Surprised Gemini 3.1 Pro beat Claude in your evals for code-gen. Any intuition why - spatial reasoning, or just cleaner OpenSCAD output?

Gemini 3.1, whilst not the best agentic coding model, has extremely strong vision, which helps it reason spatially very well.

Fable 5 was top for a brief moment, whilst it was around!