300 years is a stretch. But Legendre described linear regression ~220 years ago (1805). And from a very high-level perspective, modern neural networks are mostly just stacks of linear regression layers with non-linearities sandwiched between them. I'm obviously oversimplifying a lot, but that's the gist of it.
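To make that concrete, here's a minimal NumPy sketch of the forward pass of a two-layer network. The layer sizes, variable names, and random weights are all made up for illustration; there's no training loop, just the point that each layer is an affine map (the same form as linear regression) with a non-linearity in between:

```python
import numpy as np

rng = np.random.default_rng(0)

def affine(x, W, b):
    # Same functional form as linear regression: y = xW + b
    return x @ W + b

def relu(x):
    # The non-linearity sandwiched between the linear layers
    return np.maximum(x, 0.0)

# Illustrative sizes: 4 input features, 8 hidden units, 1 output
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

x = rng.normal(size=(5, 4))          # a batch of 5 examples
hidden = relu(affine(x, W1, b1))     # linear map, then non-linearity
output = affine(hidden, W2, b2)      # another linear map
print(output.shape)                  # (5, 1)
```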
Maybe it wasn't there originally, but now there's a footnote:
> 1. The 300-year timeframe refers to the foundational mathematical principles underlying modern bias-variance analysis, not the contemporary terminology. Bayes' theorem (1763) established the mathematical framework for updating beliefs with evidence, whilst Laplace's early work on statistical inference (1780s-1810s) formalised the principle that models must balance fit with simplicity to avoid spurious conclusions. These early statistical insights—that overly complex explanations tend to capture noise rather than signal—form the mathematical bedrock of what we now call the bias-variance tradeoff. The specific modern formulation emerged over several decades in the latter 20th century, but the core principle has governed statistical reasoning for centuries.
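For reference (my addition, not part of the article's footnote), the "specific modern formulation" it alludes to is usually written as a decomposition of expected squared error:

```latex
% Standard bias-variance decomposition of expected squared error
% for an estimator \hat{f} of a target f, with noise variance \sigma^2:
\mathbb{E}\!\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```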
"For over 300 years, one principle governed every learning system: the bias-variance tradeoff."
The bias-variance tradeoff is a very old concept in statistics (though I'm not sure exactly how old; it might very well be 300 years).
Anyway, note that the first algorithms related to neural networks predate the digital computer by at least a decade.