One of the claims it asks LLMs to grade is "Artificial intelligence will cause widespread job loss among software engineers."
Yeah man, this benchmark is really, really bad.