Why not use these as a benchmark for LLM ability to make breakthrough discoveries?
For example, prompt the 1913 model to “Invent a new theory of gravity that doesn’t conflict with special relativity.”
Would it eventually be able to get to GR? If not, could working out why it fails illuminate important weaknesses?