Hacker News

It's fundamentally how these things work: they learn a distribution over the next token given the prior context. Over many outputs, what you expect to get is the mean of that distribution. Regression to the mean is the danger I'm talking about.
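To make that concrete, here is a rough toy sketch, not how any real model is implemented. The token table `NEXT_TOKEN_PROBS` and both helper functions are made up for illustration. It samples repeatedly from a fixed next-token distribution and shows that the observed frequencies (the mean of the one-hot samples) settle onto the distribution's probabilities, so rare tokens stay rare in aggregate.

```python
import random

# Hypothetical next-token distribution for one fixed context (illustrative numbers only).
NEXT_TOKEN_PROBS = {"the": 0.5, "a": 0.3, "zeitgeist": 0.15, "petrichor": 0.05}


def sample_tokens(n, seed=0):
    """Draw n tokens from the toy distribution using a seeded RNG."""
    rng = random.Random(seed)
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=n)


def empirical_freqs(samples):
    """Fraction of samples equal to each token, i.e. the mean of the one-hot samples."""
    return {tok: samples.count(tok) / len(samples) for tok in NEXT_TOKEN_PROBS}


samples = sample_tokens(100_000)
freqs = empirical_freqs(samples)
for tok, p in NEXT_TOKEN_PROBS.items():
    print(f"{tok:10s} model p={p:.2f}  observed={freqs[tok]:.3f}")
```

The more you sample, the closer the observed column gets to the model column, which is the averaging effect described above in miniature.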