Math is a contrived system, though; there are no fundamental laws of nature that require math to be done the way we do it.

A human society might develop its own math in a base-13 system, or in an entirely different way of representing the same concepts. If they can't solve our base-10 math problems in the way we expect, does that mean they are parrots?

Part of the problem here is that we have yet to land on a clear, standard definition of intelligence that most people agree with. We could look to IQ, with all of its problems, but then we should be giving LLMs an IQ test rather than a math test.

The fact that much of physics can be so elegantly described by math suggests the structures of our math could be quite universal, at least in our universe.

Check out the problems in the MATH dataset, especially the Level 5 problems. They are fairly advanced (by most people’s standards), and most do not depend on which N is chosen for the base-N system used to solve them. The answers would be written differently, of course, but the structures of the problems and solutions remain largely intact.
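To make the base-N point concrete, here is a minimal Python sketch. The helper `to_base` and the numbers 17 and 29 are my own illustrative choices, not anything taken from the MATH dataset. It writes the same multiplication in base 10 and in base 13: the numerals change, but the underlying quantity does not.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base(n: int, base: int) -> str:
    """Write a non-negative integer as a numeral in the given base (2..36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, base)  # peel off the least significant digit
        out.append(DIGITS[r])
    return "".join(reversed(out))

# Hypothetical example values, chosen only for illustration.
a, b = 17, 29
product = a * b

for base in (10, 13):
    print(f"base {base}: {to_base(a, base)} * {to_base(b, base)} = {to_base(product, base)}")
# base 10: 17 * 29 = 493
# base 13: 14 * 23 = 2BC

# Reading the base-13 numerals back gives the same integers.
assert int("14", 13) * int("23", 13) == int("2BC", 13) == 493
```

A society working in base 13 would write "14 * 23 = 2BC" where we write "17 * 29 = 493", but both are statements about the same product.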

Website for tracking IQ measurements of LLMs:

https://www.trackingai.org/

The best one already scores higher than all but the top 10-20% of most populations.