> ... on my Macbook Max M5 128 GB

Local development for who? How many of y'all are rocking 128GB of memory? Am I reading Apple's site correctly that it's a $10,000 laptop?

You don't need nearly that much RAM to run Qwen 3.6 27B, though. qwen3.6:27b-q4_K_M is only 17GB, for example.
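For anyone who wants to sanity-check numbers like that, here is a rough back-of-the-envelope sketch. It assumes exactly 27 billion parameters, 1 GB = 1e9 bytes, and counts weights only (no KV cache or runtime overhead). The function names are made up for illustration.

```python
def estimate_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def effective_bits_per_weight(file_gb: float, params_billion: float) -> float:
    """Average bits per weight implied by a file size (assumed to be weights only)."""
    return file_gb * 8 / params_billion


if __name__ == "__main__":
    # Unquantized 16-bit weights for a 27B model
    print(f"16-bit: {estimate_size_gb(27, 16):.1f} GB")
    # What the 17GB figure quoted above implies per weight
    print(f"17GB file: {effective_bits_per_weight(17, 27):.2f} bits/weight")
```

By this estimate the 17GB file works out to roughly 5 bits per weight, against 54GB for the same model at 16 bits.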

This is what I run on an M5 MacBook Air 32GB. Works great.

I’m not having it build whole features from scratch, though. I give it pretty explicit instructions closer to the class or function level, and it still saves me an immense amount of time, while I’m very connected to the code that’s written.

Definitely the sweet spot for me.

A 27B model fits easily on a 32GB VRAM card (e.g. 5090), or in RAM on a 32GB computer, at FP8/Q8 (Unsloth has 28.6GB Q8 files).

For 24GB VRAM cards (e.g. 4090) you can use Q6_K (22.5GB) or Q5_K_M (19.5GB) quants, possibly offloading some of the weights to RAM.
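A minimal sketch of that sizing logic, using only the file sizes quoted in this thread. The 2GB headroom for context/KV cache is my own assumption, not a measured number, and `pick_quant` / `spill_gb` are hypothetical helpers, not part of any tool.

```python
from typing import Optional

# File sizes in GB as quoted in this thread for the 27B model.
QUANT_SIZES_GB = {
    "Q8": 28.6,
    "Q6_K": 22.5,
    "Q5_K_M": 19.5,
    "Q4_K_M": 17.0,
}


def pick_quant(vram_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest quant whose weights plus headroom fit in VRAM.

    headroom_gb is an assumed allowance for KV cache and runtime overhead.
    Returns None if nothing fits entirely on the card.
    """
    fitting = {
        name: size
        for name, size in QUANT_SIZES_GB.items()
        if size + headroom_gb <= vram_gb
    }
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def spill_gb(quant: str, vram_gb: float, headroom_gb: float = 2.0) -> float:
    """GB that would have to be offloaded to system RAM for this quant."""
    return max(0.0, QUANT_SIZES_GB[quant] + headroom_gb - vram_gb)


if __name__ == "__main__":
    for vram in (32, 24, 16):
        print(vram, "GB ->", pick_quant(vram))
    print("Q6_K on 24GB spills", spill_gb("Q6_K", 24), "GB to RAM")
```

With that headroom, a 24GB card fits Q5_K_M entirely, while Q6_K would push about half a gigabyte over to system RAM. That matches the "possibly offloading" caveat above.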

For the 35B model, offloading to RAM doesn't slow it down much. If you have a nice CPU and a weak GPU, it will be fast enough to use.

I'm on a 128GB RAM Strix Halo. I bought a Framework Desktop for a few thousand CAD back when everyone was calling it overpriced.

It wasn't $10k a month ago.

I work with a lot of 3D graphics and geo stuff, so I can hit the ceiling with my 48GB Mac. It's not all LLM work. I prioritized storage over RAM with my budget. Being able to run local LLMs has greatly helped me understand how they work. For day-to-day dev I pay for Gemini or Claude.

Think commercial. My company invested in a local rig since privacy is important to our customers and sometimes I want to use these models on private data.

Even in that case it would make more sense to put the hardware in a server rack shared with everyone rather than inside macbooks.

At any rate it makes a stolen backpack or spilled drink a lot less damaging.

Obviously the rig is not a macbook but indeed a server rack. I'm just saying that we're using this model for local development.

Qwen3.6 runs great on a GPU with 24GB VRAM. You could get a used 3090 for it.

Certainly won't work on my M4 Pro with 24GB lol

I’m using it on a 48GB machine and it causes some lag, so it might be worse on 24, but it should run.

Unsloth recommends 18GB of RAM for Qwen3.6-27B (for their version of the model).

https://unsloth.ai/docs/models/qwen3.6

I feel you!

Sent from my 8GB M2 Mac mini.

I'm still rocking my Nvidia 2060, which I purchased for $400 at the time.

I struggle to imagine purchasing multiple $1k+ cards on my own dime.