I actually think you should give it a spin. IMO you don't need Claude-level performance for a lot of day-to-day tasks. Qwen3 8B, or even 4B quantized, is actually quite good. Take a look at it. You can offload layers to the GPU as well, which should really help with speed. There's a setting for it.
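For reference, if the backend is llama.cpp (an assumption; the comment doesn't name the tool, and in LM Studio the equivalent is the GPU offload slider in model settings), the setting is the `-ngl` / `--n-gpu-layers` flag. A sketch, with an illustrative model filename:

```shell
# Offload up to 99 transformer layers to the GPU (i.e. as many as fit)
# and serve with an 8K context window. Model path is hypothetical.
./llama-server -m ./qwen3-8b-q4_k_m.gguf -ngl 99 -c 8192
```

Quantized GGUF builds of Qwen3 are what you'd typically run here; with all layers offloaded, generation speed is bounded by the GPU rather than the CPU.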
> Qwen3 8B, or even 4B quantized is actually quite good.
No, it’s not.
Trust me, I don't write this from a position of vague hand-waving.
I've tried a lot of self-hosted models at a lot of sizes; those small models are not good enough, and don't have a context window long enough to be useful for most everyday operations.