Hacker News

pants2 19 hours ago

Labs still aren't publishing ARC-AGI-3 scores, even though it's been out for some time. Is it because the numbers are too embarrassing?

tedsanders 14 hours ago

Honest answer is that it isn't done running yet. It takes some human bandwidth and time to run, so results weren't ready by this morning. We don't know what the score will be, but it will probably go up on the leaderboard sometime soon. I personally don't put a lot of stock in the ARC-AGI evals, as they're not relevant to most work that people do, but it should still be interesting to see as a measure of reasoning ability.

(I work at OpenAI.)

AG25 18 hours ago

GPT-5.5 was just released and OpenAI didn't mention ARC-AGI-3 at all; their score probably sucks.

kilroy123 19 hours ago

To be fair, there's not much to report. Isn't it pretty much at 0?

pants2 17 hours ago

Opus-4.6 with 0.5% currently leads GPT-5.4 with 0.2%[1].

Seems meaningful even if the absolute numbers are very low. That's sort of the excitement of it.

[1] https://arcprize.org/leaderboard