Understanding is and always has been the "hard" bottleneck. In programming work, if one drops understanding and eg let's an agent write code with only superficial human review or none at all, I believe that they can easily get 100x fast or more, the main question being whether the process collapses some point due to sloppy code. In research fields like mathematics, skipping understanding is not something that can be done without a radical reconstruction of what mathematics (as a process/activity/field) is.

It sounds plausible that LLMs help generate insights that humans have missed. But there are many open questions, eg the rate of generating insightful vs uninsightful but plausible statements, which can affect how useful they will be, and of course "open"ai has no incentive to share how much effort/cost (tokens and/or human-review) had been put into investigating erdos problems before coming up with this solution.