Can't you run small LLMs on like... a MacBook Air M1? Some models are under 1B parameters; they'll be almost useless, but I imagine you could run them on anything from the last 10 years.
But yeah, if you wanna run 600B+ parameter models you're gonna need an insane setup to run them locally.
I run Qwen models on an MBA M4 16 GB and an MBP M2 Max 32 GB. The MBA can handle models up to its unified memory capacity (with external cooling), e.g. Qwen3-Embedding-8B (not 1B!), but inference is 4-6x slower than on the MBP. I suspect the weaker SoC.
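A rough back-of-envelope sketch, assuming 4-bit quantized weights and the commonly quoted bandwidth specs (~120 GB/s for the base M4, ~400 GB/s for the M2 Max), suggests decode speed is mostly memory-bandwidth bound, which would explain most of the gap:

```python
# Rough ceiling estimate: every generated token streams the full
# weight set through the SoC, so tok/s <= bandwidth / model size.
# Assumptions (not measured): 4-bit quantized weights, nominal
# bandwidth specs for each chip.

def tokens_per_sec_ceiling(bandwidth_gb_s: float, params_b: float, bits: int = 4) -> float:
    """Upper bound: memory bandwidth divided by bytes read per token."""
    model_gb = params_b * bits / 8  # 8B params at 4-bit ~ 4 GB of weights
    return bandwidth_gb_s / model_gb

for chip, bw in [("M4 (base, ~120 GB/s)", 120), ("M2 Max (~400 GB/s)", 400)]:
    print(f"{chip}: ~{tokens_per_sec_ceiling(bw, 8):.0f} tok/s ceiling for an 8B model")
```

That comes out to roughly a 3.3x gap between the two chips, which is in the same ballpark as the 4-6x I see in practice; the rest is probably compute and KV-cache traffic.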
Anyway, Apple's M-series SoCs give you huge leverage thanks to unified memory: VRAM size is effectively RAM size, so if you buy an M chip with 128+ GB of memory you're pretty much able to run SOTA models locally, and the price is significantly lower than dedicated AI GPU cards.
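For anyone sizing a purchase, a quick weights-only estimate (just parameter count * bits / 8, not a benchmark) shows what a given unified memory config can actually hold:

```python
# Weights-only sizing sketch: KV cache, context, and the OS all come
# on top, and macOS keeps part of RAM for the CPU side, so treat
# these as lower bounds.

def weights_gb(params_b: float, bits: int) -> float:
    return params_b * bits / 8

for name, params in [("8B", 8), ("70B", 70), ("~670B MoE (DeepSeek-class)", 670)]:
    row = ", ".join(f"{bits}-bit: {weights_gb(params, bits):.0f} GB" for bits in (16, 8, 4))
    print(f"{name}: {row}")
```

So 128 GB comfortably fits 70B-class models even at 8-bit, while the 600B+ MoE monsters from the top of the thread still need heavier quantization or the very top memory configs.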
They "run" in the most technical sense, yes. But they're unusably slow.