You could always run your own server locally if you have a decent GPU. Some of the smaller LLMs are getting pretty good.

Also M-series Macs have an insane price/performance/electricity consumption ratio in LLM use-cases.

Any M-series Mac Mini can run a pretty good local model at usable speed, and the higher-end Macs can easily compete with dedicated GPUs.

Correct. My dusty Intel NUC is able to run a decent 3B model (thanks to Ollama) with the fans spinning, but it doesn't affect any other running applications. It is very useful for local hobby projects. Visible lag and freezes begin if I start a 5B+ model locally.
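For anyone curious, the setup is basically two commands. This is a sketch assuming a recent Ollama install; the specific model tag (`llama3.2:3b`, one of the common ~3B options) is my pick, since the comment above doesn't say which 3B model they run:

```shell
# Download a ~3B model once (a few GB on disk)
ollama pull llama3.2:3b

# Start an interactive chat with it; Ollama serves it on localhost,
# so nothing leaves your machine
ollama run llama3.2:3b
```

Ollama also exposes a local HTTP API on port 11434, so other hobby projects on the same box can call the model without any cloud dependency.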

Yes - of course. That's been my experience with "ultimate" privacy.