Unfortunately for those companies, their APIs are a commodity and highly fungible, so they'll need to keep training or be replaced by whichever competitor does. It's a war of attrition.

I wonder if we're reaching a point of diminishing returns with training, at least from just scaling the data set. There's a finite amount of information (that can reasonably be obtained) to train on, and I think we're already using a sizable chunk of it, not to mention the cost of naively scaling up. My guess is that the ultimate winner will be whoever figures out how to improve without massive training costs, through better algorithms, or maybe even just better hardware (e.g. neuristors).

We know that, in the worst case, it's possible to build something with human-level intelligence that runs on about 20 watts, fits in roughly the volume of a human head, and only needs to ingest a small slice of all available information. And training should only take about 3.5 MWh total (20 W sustained over roughly 20 years of development), and can be done on the same hardware that runs the model.
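(If anyone wants to check the 3.5 MWh figure, it's just 20 W run continuously; the ~20-year span is my assumption for how long a brain takes to reach adult-level competence. Quick back-of-envelope:)

    # Back-of-envelope energy for "training" a human brain
    watts = 20                      # rough resting power draw of a brain
    hours = 24 * 365 * 20           # assumed ~20 years to adult-level competence
    wh = watts * hours              # ~3,504,000 Wh
    print(f"{wh / 1e6:.1f} MWh")    # -> 3.5 MWh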