These are very serious research-level math questions. They are not “Erdős-style” questions; they look more like problems or lemmas that I encountered while doing my PhD: things that don’t make it into the papers but were part of an interesting diversion along the way.

It seems likely that PhD students in the authors’ subfields are capable of solving these problems. What makes them interesting is that they seem to require fairly high research-level context to make real progress.

It’s a test of whether the LLMs can really synthesize results from knowledge that requires several years of postgraduate preparation in a specific research area.

So these are like those problems that are “left for the reader”?

Not necessarily. Even the statements may not appear in the final paper. The questions arose during research, and understanding them was needed for the authors to progress, but maybe not needed for the goal in mind.

Very serious for mathematicians - not for ML researchers.

If the paper hadn’t had the AI spin, would those 10 questions still have been interesting?

It seems to me that we have here a paper that is interesting solely because of the AI spin -- while at the same time that AI spin is poorly executed from the point of view of AI research, where this should be a blog post at most, not an arXiv preprint.

The timed-reveal aspect is also interesting.

How is that interesting from a scientific point of view? This seems more like a social experiment dressed up as science.

Science should be about reproducibility, and almost nothing here is reproducible.