My strix halo board is feeling more useful and less toylike with the recent performance gains combined from MTP, better quantization, and generalized performance improvements across the stack. For example, I can run Unsloth's Gemma4-31B 4-bit QAT model with around 30tg and 200pp. I don't find that to be too slow at all. Particularly because it's nearly full accuracy and good enough for a lot of different stuff I throw at it.

I think it also helps that I'm using my machine to do home server stuff. It excels at all of the traditional workloads. Then I can lean on the AI to help with automation here and there. I find it deeply satisfying.

you can absolutely use it for some workloads, but as soon as you have some extra complexity for a big repo it'll take forever and the economics are so silly to the point that the electricity bill would be comparable to a subscription. I love having the possibility of running things locally if some random dude decide to pull them plug, and give me solice the fact that i can have 100% private inference, but as the main driver during the day? shoot me