It's a useless, meaningless benchmark though; it just has a catchy name, implying that if a model solves it, it has "AGI", which is clearly rubbish.

ARC-AGI score isn't correlated with anything useful.

It's correlated with the ability to solve logic puzzles.

It's also interesting because it's very, very hard for base LLMs, even if you try to "cheat" by training on millions of ARC-like problems. Reasoning LLMs show genuine improvement on this type of problem.
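For anyone unfamiliar with the format: each ARC task gives a few input/output grid pairs demonstrating a hidden transformation rule, and the solver must apply that rule to a new grid. A minimal Python sketch of a toy task, where the grids and the solve rule are made up for illustration and only the train/test shape loosely follows the public ARC dataset's JSON layout:

```python
# Toy ARC-style task: a few demonstration pairs plus a held-out test input.
# Real ARC tasks use the same {"train": [...], "test": [...]} structure,
# but the transformation rules are far less obvious than this one.
toy_task = {
    "train": [
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]},
    ],
    "test": [
        {"input": [[3, 0], [0, 3]]},  # expected output: [[0, 3], [3, 0]]
    ],
}

def solve(grid):
    """Hypothetical solver for this toy rule: mirror each row left-to-right."""
    return [row[::-1] for row in grid]

# Check the rule against the demonstration pairs, then apply it to the test input.
for pair in toy_task["train"]:
    assert solve(pair["input"]) == pair["output"]
print(solve(toy_task["test"][0]["input"]))  # [[0, 3], [3, 0]]
```

The hard part, of course, is that a model has to infer the rule from the handful of demonstrations rather than being handed `solve` up front.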

How would we actually, objectively measure a model to see whether it is AGI, if not with benchmarks like ARC-AGI?

Give it a prompt like

>can u make the progm for helps that with what in need for shpping good cheap products that will display them on screen and have me let the best one to get so that i can quickly hav it at home

And get back an automatic coupon code app like the user actually wanted.

ARC-AGI 2 is an IQ test. IQ tests have been shown over and over to have predictive power in humans: people who score well on them tend to be more successful.

IQ tests only work if the participants haven't trained for them. If they take similar tests a few times in a row, their scores increase a lot. Current LLMs are hyper-optimized for the particular types of puzzles contained in popular "benchmarks".