I think that the future of local LLMs is delegation. You give the local model a prompt and it very quickly identifies what should be used to solve it.

Can it be solved on-device with locally running MCPs? Or maybe it needs a system API, like reading your calendar or checking your email. Otherwise it identifies the best cloud model and sends the prompt there.

Basically Siri if it was good
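
To make the delegation idea above concrete, here is a minimal sketch of the routing step. Everything in it is an assumption for illustration: the route names, the target labels, and especially the keyword matching, which stands in for whatever small local model would actually make the call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    name: str                       # hypothetical handler name
    target: str                     # "local_tool", "system_api", or "cloud"
    matches: Callable[[str], bool]  # stand-in for the local model's decision


# Hypothetical routes. A real router would ask a small local model,
# not check for keywords.
ROUTES = [
    Route("calendar", "system_api", lambda p: "calendar" in p or "meeting" in p),
    Route("email", "system_api", lambda p: "email" in p or "inbox" in p),
    Route("files", "local_tool", lambda p: "file" in p or "folder" in p),
]


def route(prompt: str) -> tuple[str, str]:
    """Return (target, handler) for a prompt, falling back to the cloud."""
    text = prompt.lower()
    for r in ROUTES:
        if r.matches(text):
            return (r.target, r.name)
    # Nothing local claimed the prompt, so hand it to a cloud model.
    return ("cloud", "default-cloud-model")


print(route("What's on my calendar today?"))
print(route("Explain the proof of Fermat's last theorem"))
```

The fallback line is the interesting part: anything the local step can't confidently claim ends up in the cloud, so the quality of that first decision determines how much actually stays local.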

I completely disagree. I don't see the current status quo fundamentally changing.

That idea makes so much sense on paper, but it's not until you start implementing it that you realize why no one does it (including Siri). "Some tasks are complex and better suited to a giant, complex model, but small models are perfectly capable of running simple, limited tasks" makes a ton of sense, but the component best equipped to make that decision is the smarter component of your system. At which point, you might as well have had it run the task.

It's like assigning the intern to triage your work items.

When actually implementing an application with that approach, every time you encounter an "AI miss" you would (understandably) blame the small model, and eventually give up and delegate yet another scenario to the cloud model.

Eventually you feel you're artificially handcuffing yourself, compared to literally everybody else, by trying to ship something built on a 1B model. You have the worst of all options: a crappy model with lots of hiccups that is still (by far) the most resource-intensive part of your application, making the whole thing super heavy, and you are delegating more and more to the cloud model anyway.

The local LLM scenario is going to be driven entirely by privacy concerns (for which there is no alternative; it's not like an E2EE LLM API could exist) or by cost concerns, if you believe you can run it cheaper.

Doesn’t this ignore that some data may be privileged or too big to send to the cloud? Perhaps I have my health records in Apple Health and Kaiser Permanente. You can imagine it being okay for them to be accessed locally, but not sent up to the cloud.

I’m confused. Your Apple Health or Kaiser Permanente data is already stored in the cloud. It’s not like it’s only ever stored locally, and if you lost your phone you'd lose your Apple Health or Kaiser Permanente data.

I already mentioned privacy being the only real concern, but it won’t really be end-user privacy. At least that particular concern isn’t the ball mover that people’s comments here would make you think it is. Plenty of people are already storing their medical information in Google Drive and Gmail attachments. If end-user privacy from “the cloud” was actually a thing, you would have seen that reflected in the market.

The privacy concerns that matter are those of organizations.