"AI in the datacenter" and "AI on local consumer hardware" will eventually become two separate niches with entirely different capabilities, at least if scaling laws continue unchanged and there's no near-term inherent ceiling on AI smarts. The real point of the datacenter is to do datacenter-scale things. But you don't need that kind of vast compute to run even the largest open models today: on-prem hardware can do it easily, especially if you're OK with a somewhat delayed response.