"AI in the datacenter" and "AI on local consumer hardware" will eventually become two separate niches with entirely different capabilities, at least if scaling laws continue unchanged and there's no near-term inherent ceiling on AI smarts. The real point of the datacenter is to do datacenter-scale things. But you don't need that kind of vast compute to run even the largest open models today: on-prem hardware can do it easily, especially if you're OK with a somewhat delayed response.