Yeah, there is a lot of advantage to having this machine because the CUDA stack is still king. My Two AMD GPUs are suffering when it comes to working with ROCm stack. I have forks of Ollama and VLLM that took many weekends to figure out.
Yeah, there is a lot of advantage to having this machine because the CUDA stack is still king. My Two AMD GPUs are suffering when it comes to working with ROCm stack. I have forks of Ollama and VLLM that took many weekends to figure out.
If you're on Strix Halo, check out Donato's prebuilt toolboxes for ROCm with RADV or Vulkan:
https://github.com/kyuz0/amd-strix-halo-toolboxes
It takes all the work out of it, you just start llama-server in the container context and you're off doing inference without having to figure out dependencies.
Oh yeah he is doing great things. Not on Strix myself but his dual AMD AI Pro r9700 build ironically is the same machine I built.