To be fair, there's not much to report. Isn't it pretty much at 0?
Opus-4.6 with 0.5% currently leads GPT-5.4 with 0.2%[1].
Seems meaningful even if the absolute numbers are very low. That's sort of the excitement of it.
[1] https://arcprize.org/leaderboard