> a model that runs on my own machine will never have the capacity of a model that runs in a datacenter.

I don’t think so. A locally run model only needs to serve one or a few people, so it doesn’t need datacenter-scale throughput. It seems possible to run a DeepSeek v4 model at full capacity on a server costing around 200k USD. Very expensive, but not impossible.

Factor in hardware and software improvements over time, plus the fact that most people may only need a smaller, quantized model, and a PC in the 10k USD range should be enough.
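To put rough numbers on the quantization point, here is a back-of-the-envelope sketch of how much memory the weights alone take at different precisions. The parameter counts are hypothetical placeholders, not the specs of any particular model, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical model sizes, chosen only to illustrate the scaling.
for params in (70, 700):
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{params}B params @ {bits}-bit: ~{gb:.0f} GB")
```

The point is just that memory scales linearly with both parameter count and bits per weight, so going from 16-bit to 4-bit cuts the weight footprint to a quarter, which is a big part of why a smaller quantized model can fit on much cheaper hardware.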