Hacker News

famouswaffles 14 hours ago

Humans and LLMs are not seeing the benchmark in the same format. What's made up about that? Can you solve this in the JSON format?

Look man, don't reply if you don't want to.