Hacker News

Just because humans are usually tested in a particular way that allows them to make up for a lack of generality with an outstanding performance in their specialization doesn't mean that is a good way to test generalization itself.

Apparently someone here doesn't know how outliers affect a mean or, for that matter, have any clue about the purpose of the ARC-AGI benchmark.

For anyone who is interested in critical thinking, this paper describes the original motivation behind the ARC benchmarks:

https://arxiv.org/abs/1911.01547

famouswaffles 14 hours ago

>Apparently someone here doesn't know how outliers affect a mean.

If the concern is that easy questions distort the mean, then the obvious fix is to reduce the proportion of easy questions, not to invent a convoluted scoring method to compensate for them after the fact. Standardized testing has dealt with this issue for a long time, and there’s a reason most systems do not handle it the way ARC-AGI-3 does. Francois is not smarter than all those people, and certainly neither are you.

This shouldn't be hard to understand.