I don't really consider that a great benchmark anyway. We really need better, objective ones instead of these, which are mostly performative, cheatable, and already available in the training set.
> Pelican for Fable 5 on default settings is a clear improvement on Opus 4.8
And doesn't contain any actual criticism within the comment (your blog post might, but just referring to what was posted on HN, which is a bit booster-y on its own).
The entire pelican benchmark is a joke. The joke is that, for all of the billions of dollars poured into these things and the claims of PhD-level intelligence, they still draw pelicans not much better than a five-year-old would.
I don't spell that joke out in every comment I post here because that wouldn't be very funny.
I think it's a clever thing he did to basically guarantee he continues to get major traffic to his blog here every time a model is released, especially since he's taking sponsorships with a static banner at the top of every page now. I think he's trying to go the Daring Fireball route.
You can't tell someone to "get a life" while taking the effort to create a burner account for the sole purpose of insulting someone.
Simon's pelicans are an institution. Are you trying to get banned? Lmao.
To me it's as if crypto bros had been allowed to shill their DAOs and tokens during the crypto/NFT phase.
He is the only person not getting rate-limited for shilling AI all the time.
Pointing out how much the models still suck at drawing pelicans is a funny way to shill them.
Tbf the first line of your first comment is:
> Pelican for Fable 5 on default settings is a clear improvement on Opus 4.8
And doesn't contain any actual criticism within the comment (your blog post might, but just referring to what was posted on HN, which is a bit booster-y on its own).