Until NVIDIA prices get better, I’ll keep building out on the Intel stack and keep the cache (and prompt processing speeds) happy.

As for software, anything that has a SYCL or Vulkan backend, or that can be Intel-optimized (especially to the same degree as llama.cpp), should run well.
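
For a rough idea of how you might pick between the two backends on a given box, here is a minimal sketch. It assumes that `sycl-ls` (from oneAPI) or `vulkaninfo` (from the Vulkan tools) being on the PATH is a usable hint, and that `-DGGML_SYCL=ON` / `-DGGML_VULKAN=ON` are the llama.cpp CMake options for your checkout; check the repo's build docs before relying on either. `pick_backend` and the tables around it are names made up for illustration.

```python
import shutil

# Assumption: these CMake options match your llama.cpp checkout.
BACKEND_FLAGS = {
    "sycl": "-DGGML_SYCL=ON",
    "vulkan": "-DGGML_VULKAN=ON",
}

# Probe order: prefer SYCL, fall back to Vulkan.
# Finding the tool on PATH is only a hint, not proof of a working device.
PROBES = [("sycl", "sycl-ls"), ("vulkan", "vulkaninfo")]


def pick_backend(which=shutil.which):
    """Return (backend, cmake_flag) for the first probe tool found, else None."""
    for backend, tool in PROBES:
        if which(tool) is not None:
            return backend, BACKEND_FLAGS[backend]
    return None


if __name__ == "__main__":
    print(pick_backend())
```

The `which` parameter is only there so the lookup can be swapped out when testing; on a real machine you would just call `pick_backend()`.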