At what point do model providers start optimizing for the "pelican riding a bicycle" test so they place well on Simon's influential benchmark? :-)

They almost certainly already are, even if unknowingly, because HN and blog posts get piped continuously into models' training corpora.

See https://simonwillison.net/2025/Nov/13/training-for-pelicans-...

Why is the assumption that they trained for a pelican on a bicycle, rather than running RL for all kinds of 'generate an SVG' tasks?

Gemini did exactly that, and boasted about it at launch: https://x.com/JeffDean/status/2024525132266688757

That post doesn't say anything about training for SVG generation.

https://blog.google/innovation-and-ai/models-and-research/ge...

> Code-based animation: 3.1 Pro can generate website-ready, animated SVGs directly from a text prompt. Because these are built in pure code rather than pixels, they remain crisp at any scale and maintain incredibly small file sizes compared to traditional video.