Hacker News

teruakohatu 14 hours ago

The pelican is really getting old as a standalone evaluation metric. By now pelicans are certainly going to be in the training set, if the models aren't explicitly tuned to produce them for the press on HN alone.

Keep the pelican, but isn’t it time to add something more novel that all current and past models struggle with?

whywhywhywhy an hour ago

One-shot canvas and SVG images or animations are also something that shouldn't be an issue at all at this scale; even Qwen running locally on 24 GB cards can do impressive ones.

I don't understand why this test gets any attention. Other than the pelicans, which aren't a good test, there's no meat in this article.

justinclift 13 hours ago

Relevant: https://news.ycombinator.com/item?id=47839493

caseyf7 8 hours ago

It also seems like all of the models have converged on very similar images.