Hacker News

It was a completely useless test even before the labs trained for it.

Yes, it's always been published as a joke. You've explained why it was (and still is) funny meta-commentary on AI benchmarks.