They run fairly well for me on my 128GB Framework Desktop.
What do you run this on, if I may ask? LM Studio, Ollama, llama.cpp? Which CLI?
I run Qwen3-Coder-Next (the Qwen3-Coder-Next-UD-Q4_K_XL quant) on a custom build around the Framework ITX board (Ryzen AI Max+ 395, 128GB). Prompt eval averages 200-300 t/s and output 35-40 t/s, running llama.cpp with ROCm. I prefer Claude Code for the CLI.
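If anyone wants to try a similar setup, a rough sketch of the launch; the filename, context size, and build flags are illustrative, not my exact command, so check them against your llama.cpp checkout:

    # Build llama.cpp with ROCm/HIP support, then serve the GGUF.
    # Model filename and flag values below are placeholders.
    cmake -B build -DGGML_HIP=ON && cmake --build build -j
    build/bin/llama-server \
      -m Qwen3-Coder-Next-UD-Q4_K_XL.gguf \
      -ngl 99 \
      -c 32768 \
      --host 127.0.0.1 --port 8080

llama-server exposes an OpenAI-compatible endpoint, so any CLI coding agent that can point at a custom base URL will talk to it.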
Can't speak for the parent, but I've had decent luck with llama.cpp on my triple Ryzen AI Pro 9700 XTs.
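For a multi-GPU box like that, llama.cpp can split the model across cards. A minimal sketch, assuming three cards with equal VRAM (the ratios and filename are placeholders to tune):

    # Split layers across three GPUs; -ts / --tensor-split sets the
    # per-device ratio, --split-mode layer distributes whole layers.
    llama-server -m model.gguf -ngl 99 --split-mode layer -ts 1,1,1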