> I expect language models to also get crazy good at mathematical theorem proving
Indeed, systems like AlphaProof / AlphaGeometry are already able to win a silver medal at the IMO, and the former relies on Lean for theorem verification [1]. On the open source side, I really like the ideas in LeanDojo [2], which use a form of RAG to assist the LLM with premise selection.
[1] https://deepmind.google/discover/blog/ai-solves-imo-problems...