I actually think you should give it a spin. IMO you don't need Claude-level performance for a lot of day-to-day tasks. Qwen3 8B, or even 4B quantized, is actually quite good. Take a look at it. You can offload layers to the GPU as well, which should really help with speed. There's a setting for it.
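For reference, if the backend is llama.cpp (an assumption; the comment doesn't name the tool, and in LM Studio the equivalent is the GPU offload slider in model settings), the setting is the `-ngl` / `--n-gpu-layers` flag. A sketch, with an illustrative model filename:

```shell
# Offload up to 99 transformer layers to the GPU (i.e. as many as fit)
# and serve with an 8K context window. Model path is hypothetical.
./llama-server -m ./qwen3-8b-q4_k_m.gguf -ngl 99 -c 8192
```

Quantized GGUF builds of Qwen3 are what you'd typically run here; with all layers offloaded, generation speed is bounded by the GPU rather than the CPU.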
> Qwen3 8B, or even 4B quantized is actually quite good.
No, it’s not.
Trust me, I don't write this from a position of vague hand-waving.
I've tried a lot of self-hosted models at a lot of sizes; those small models are not good enough, and don't have a context window long enough to be useful for most everyday operations.