i’m running m4 pro 48gb right now
omlx + gemma 12b 6 bit + pi
it’s feasible for sure
MoEs for speed (qwen 35b, cohere 30b, gemma 26b)
Dense for more methodical work (qwen 27b [reigning champ], gemma 31b, gemma 12b)
For MoE i recommend 5-bit or higher
For dense i think 4-bit is okay
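rough napkin math for what those bit widths mean on a 48gb machine. This is a sketch that only counts the weights (parameters × bits ÷ 8). Real quant formats add some overhead for scales, and the KV cache for your context comes on top, so treat the numbers as a lower bound. `weight_gb` and the model/bit pairings below are just illustrative picks from the list above, not measured sizes:

```python
def weight_gb(params_billion: float, bits: float) -> float:
    """Approximate size of quantized weights in GB (10^9 bytes).

    Weights only: ignores quantization scales, KV cache and runtime overhead.
    """
    return params_billion * bits / 8


# (name, parameters in billions, bits per weight) -- illustrative pairings
candidates = [
    ("gemma 12b", 12, 6),
    ("qwen 27b", 27, 4),
    ("gemma 31b", 31, 4),
    ("qwen 35b (MoE)", 35, 5),
]

for name, params, bits in candidates:
    print(f"{name} @ {bits}-bit: ~{weight_gb(params, bits):.1f} GB of weights")
```

whatever is left after the weights is what you have for context, the OS and everything else you run.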
Play with your context size, you don’t really need that much. Set up lazy loading for tools and MCPs
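the idea behind lazy loading, as a minimal sketch: keep only a one-line summary per tool in the prompt and pull the full schema in when the model actually asks for that tool. `LazyToolRegistry` and everything in it is hypothetical, made up for illustration. It is not how pi or any of the extensions below implement it:

```python
from typing import Callable, Dict


class LazyToolRegistry:
    """Hypothetical registry: cheap summaries up front, full schemas on demand."""

    def __init__(self) -> None:
        self._summaries: Dict[str, str] = {}
        self._loaders: Dict[str, Callable[[], dict]] = {}
        self._loaded: Dict[str, dict] = {}

    def register(self, name: str, summary: str, loader: Callable[[], dict]) -> None:
        # Store only the short summary and a callable that builds the schema later.
        self._summaries[name] = summary
        self._loaders[name] = loader

    def prompt_index(self) -> str:
        # This short index is all that sits in the context by default.
        return "\n".join(f"{n}: {s}" for n, s in sorted(self._summaries.items()))

    def load(self, name: str) -> dict:
        # Build the full schema once, on first use, then reuse the cached copy.
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]


registry = LazyToolRegistry()
registry.register(
    "read_file",
    "read a file from disk",
    lambda: {"name": "read_file", "parameters": {"path": {"type": "string"}}},
)
print(registry.prompt_index())
print(registry.load("read_file"))
```

the context only pays for the full schema of tools that actually get used.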
my pi extensions for anyone looking for a skinny quick setup, i'm using `--no-skills` right now too:
"npm:pi-codex-goal",
"npm:pi-simplify",
"npm:pi-mcp-adapter",
"git:github.com/elpapi42/pi-minimal-subagent",
"npm:@wierdbytes/pi-statusline",
"npm:@aliou/pi-guardrails",
"npm:pi-lens",
"npm:@juicesharp/rpiv-todo",
"npm:pi-hashline-readmap",
"npm:@mrclrchtr/supi-review",
"npm:pi-cmux",
"npm:@mrclrchtr/supi-context",
"npm:pi-tool-search"
think of local models as "zero sugar" models, that's where we're at right now. I think it's crazy how good these models are compared to last year's frontier models.