But are we talking pure LLMs, or existing AI solvers augmented with LLMs? Because while the latter is impressive, it doesn't say much outside of this specific domain.

If anything, I see a growing trend toward specialized vertical software that uses LLMs at its core, but with a lot of supporting tooling around them to really make the most of them.

The announcement says:

> This was solved by GPT-5.4 Pro (prompted by Price)

See the discussion here: https://www.erdosproblems.com/forum/thread/1196

"are we talking pure LLMs, or existing AI solvers augmented with LLM"

Why do these distinctions matter?

Is it an LLM, or symbolic, or a combo, or a dozen technologies stitched together? Who cares? It is all automation. It is all artificial.

It matters in terms of whether it’s generalizable. We have had computers doing impressive specific things for decades.

True, it's an achievement either way. But if an "out of the box" LLM can solve difficult math problems, it is an achievement by the LLM vendor. Otherwise, it is an achievement by the people doing the vertical integration.

In the context of evolving LLMs, this is the crucial distinction.

The distinctions matter since computational proofs have been around for decades.