Running dual Pro B60 on Debian stable mostly for AI coding.

I was initially confused about which packages were needed (the backports kernel plus the Ubuntu kobuk-team PPA works for me). After getting that sorted, I'm now running vLLM mostly without issues (though I don't run it 24/7).
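For anyone hitting the same wall, roughly what that setup looks like on Debian 12 — the PPA line and package names below are my best guess at the recipe, not a verified one, so check the kobuk-team PPA page for the current package set:

```shell
# Newer kernel from backports (Battlemage needs a recent kernel GPU driver)
sudo apt install -t bookworm-backports linux-image-amd64

# Intel GPU userspace from the kobuk-team PPA (built for Ubuntu, but
# reported to work on Debian stable); add the repo manually, since
# add-apt-repository targets Ubuntu releases
echo "deb https://ppa.launchpadcontent.net/kobuk-team/intel-graphics/ubuntu noble main" \
  | sudo tee /etc/apt/sources.list.d/kobuk-intel-graphics.list
sudo apt update

# Compute runtime bits (names are illustrative; the PPA lists the full set)
sudo apt install libze1 libze-intel-gpu1 intel-opencl-icd
```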

At first I had major issues with model quality, but the vLLM XPU folks fixed it fast.

The software isn't as capable as NVIDIA's stack yet (e.g. no fp8 KV cache support last I checked), but at this price difference I don't care. I can basically run a small fp8 local model with almost 100k tokens of context, and that's what I wanted.
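Rough math behind that claim, assuming 2x24GB cards. The architecture numbers below (layer count, GQA heads, head dim) are illustrative values typical for this size class, not the specs of any particular model:

```python
# Back-of-envelope VRAM estimate: fp8 weights + fp16 KV cache
# (fp16 KV because fp8 KV cache isn't supported on XPU yet).

def kv_cache_bytes(tokens, layers=48, kv_heads=8, head_dim=128, dtype_bytes=2):
    # K and V (factor of 2), per layer, per KV head, per token
    return 2 * layers * kv_heads * head_dim * dtype_bytes * tokens

params = 27e9                           # hypothetical 27B model
weights_gb = params * 1 / 1e9           # fp8 = 1 byte/param -> ~27 GB
kv_gb = kv_cache_bytes(100_000) / 1e9   # ~19.7 GB at 100k tokens

print(f"weights ~{weights_gb:.0f} GB + KV ~{kv_gb:.1f} GB "
      f"= ~{weights_gb + kv_gb:.0f} GB vs 48 GB total")
```

So a 27B model at fp8 with 100k context is right at the edge of 48 GB before activations and framework overhead, which is why "small model, big context" is the comfortable operating point.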

> small fp8 local model with almost 100k token context

It wouldn't fit Qwen3.5 27B, would it? That's the SOTA.