You’re presupposing that we can actually afford to just keep throwing more compute at the problem.
Moore’s law is long dead, leading-edge nodes are getting ever more expensive, and the most recent generation of tensor silicon isn’t significantly better in FLOPS/watt than the previous one.
Given that model performance has consistently trended log-linear with compute, each constant gain in performance requires a multiplicative increase in FLOPS, so there must be a point at which throwing more compute at the problem is no longer economically viable.
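As a back-of-envelope sketch of that argument (the slope, baseline budget, and $/FLOP below are made-up illustrative numbers, not real figures): if performance scales like a + b·log10(compute), every fixed performance increment costs roughly 10x more compute, and therefore dollars, than the last.

```python
import math

def compute_multiplier(perf_gain, b=1.0):
    """Compute multiplier needed for a given performance gain,
    assuming perf = a + b*log10(compute). Slope b is an assumed value."""
    return 10 ** (perf_gain / b)

cost_per_flop = 1e-18  # assumed $/FLOP, purely illustrative
base_flops = 1e24      # assumed baseline training budget in FLOPs

for gain in [1, 2, 3, 4]:
    flops = base_flops * compute_multiplier(gain)
    print(f"+{gain} perf units -> {flops:.1e} FLOPs, ~${flops * cost_per_flop:,.0f}")
```

Under those toy numbers, four equal performance steps take you from ~$10M to ~$10B per training run, which is where the "no longer economically viable" point bites.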