I could see a future in which the major AI labs ship a local LLM to offload much of the computational effort currently done in the cloud: the heavy lifting stays with cloud-hosted models, while the easier requests are handled by local inference.
Wouldn't that be counter to their whole business model?
I don't think so. Acquiring hardware for inference is a chokepoint on growth. If they can offload some inference to the customer's machine, that frees up more of their hosted capacity to generate revenue.
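Concretely, a minimal sketch of what that client-side routing might look like. Everything here (`local_generate`, `cloud_generate`, the `looks_easy` heuristic) is a hypothetical placeholder, not any lab's actual API:

```python
import re

# Hypothetical placeholders -- neither of these is a real SDK call.
def local_generate(prompt: str) -> str:
    """Run the small on-device model (stubbed out here)."""
    return f"[local model answers: {prompt[:40]}]"

def cloud_generate(prompt: str) -> str:
    """Call the lab's hosted frontier model (stubbed out here)."""
    return f"[cloud model answers: {prompt[:40]}]"

def looks_easy(prompt: str) -> bool:
    """Crude heuristic router: short prompts with no obvious math or
    multi-step markers stay on-device. A real router would more likely
    be a small learned classifier than a regex."""
    return len(prompt) < 500 and not re.search(
        r"\bprove\b|\\begin\{|step.by.step", prompt, re.IGNORECASE
    )

def generate(prompt: str) -> str:
    # Easy requests never leave the machine; hard ones go to the cloud,
    # freeing hosted capacity for the work that actually needs it.
    return local_generate(prompt) if looks_easy(prompt) else cloud_generate(prompt)

print(generate("What's the capital of France?"))        # routed locally
print(generate("Prove, step by step, that " + "x" * 600))  # routed to cloud
```

The point of the sketch is just the dispatch: the provider only pays for the second call, and the customer's hardware absorbs the first.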