It's possible but unlikely given the short timeline, diverse questions that require multiple mathematicians, and low stakes. Also they've already run preliminary tests.
> It's possible but unlikely given the short timeline
Yep. "possible but unlikely" was my take too. As another person commented, this isn't really a benchmark, and as long as that's clear, it seems fair. My only fear is that some submissions may be AI-assisted rather than fully AI-generated, with crucial insights coming from experienced mathematicians. That's still a real achievement even if it's human + AI collaboration. But I worry that the nuance will be lost on the news media and they'll publish stories about the dawn of fully autonomous math reasoning.