There is a limit to how much we can compress information, especially the spatiotemporal information that leads to breakthroughs in mathematics.

That still doesn't mean we can't get 99% of the benefits of current 10T models from a 1B model plus search.