Yes, I have an AMD Ryzen AI Max+ chip with the memory split set to give 96 gigs to the GPU and 32 gigs to the CPU. I got it last week, and I've been running gpt-oss-120b at q5 at about 40 t/s. I run Linux with llama.cpp compiled against ROCm 7.
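For anyone wanting to poke at a similar setup from code, here's a minimal sketch using the llama-cpp-python bindings rather than the llama.cpp CLI/server I actually run; the GGUF filename is hypothetical, so substitute whatever Q5 quant you downloaded.

```python
# Minimal sketch via llama-cpp-python (assumes the bindings were built against
# a ROCm/HIP-enabled llama.cpp). The model path is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="gpt-oss-120b-Q5_K_M.gguf",  # hypothetical filename
    n_gpu_layers=-1,  # offload all layers to the GPU (the 96 GB carve-out)
    n_ctx=8192,       # context window; adjust to taste
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```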