I was looking into this for LLMs but it's clearly a graphics-processing focused card. The memory bandwidth is too low for that much RAM to be useful in an LLM context. The 5090 I have has the same amount of RAM but far more bandwidth and that makes it much more useful.
Compared to a B70, a 5090 is 1x the memory with 3x the bandwidth at 4x the price. Yeah, the 5090 is better, but you're paying for it.
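The memory-vs-bandwidth trade-off above can be made concrete with the usual rule of thumb for token generation: decode is memory-bandwidth-bound, so each generated token streams roughly the whole model from VRAM once. A minimal sketch, with illustrative (not official) bandwidth and model-size figures:

```python
# Rough bandwidth-bound ceiling on LLM decode speed:
#   tokens/sec ≈ memory bandwidth / model size in bytes.
# The numbers below are illustrative assumptions, not measured specs.

def decode_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/sec for a memory-bandwidth-bound decoder."""
    return bandwidth_gb_s / model_gb

# Hypothetical figures: a ~450 GB/s card vs. a ~3x-faster 5090-class card,
# both holding the same 30 GB quantized model.
slow = decode_tokens_per_sec(450, 30)
fast = decode_tokens_per_sec(1350, 30)
print(f"ceilings: {slow:.0f} vs {fast:.0f} tokens/sec")
```

This is why the same 32 GB of RAM is worth less on a low-bandwidth card for LLM decoding: capacity sets which models fit, but bandwidth sets how fast they generate.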
On the actual market it's $1,100 vs. $3,200 now, right? I actually got mine at $2,200 at cost in the before days.
Current lowest price for a new card on Newegg: $949.99 vs $3,699.99.
Wow, 5090 prices have exploded. Thanks for looking. I should have known my hardware price intuition is broken.
Oh wow, I really would've expected higher memory bandwidth. That's only ~2-3x the little DGX Spark-alike I have to play with.
> it's clearly a graphics-processing focused card.
Yes, that's what the G in GPU stands for. It's great to see that there are still manufacturers that understand this.
It’s 32GB for people who can’t go for scalped 5090s but have a 3090 budget.
I have a pair of them with a 9480 and the only thing I have to do is keep the cache happy.
Eh. Trading CUDA for 8 more gigs seems like a bad deal, unless you know absolutely for certain that what you want to run will run on it.
Until NVidia prices get better, I’ll build out with the Intel stack and keep the cache (and prompt processing speeds) happy.
As for software, anything that has a SYCL or Vulkan backend, and/or can be Intel optimized (especially to the same degree as llama.cpp) can run well.
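For llama.cpp specifically, the SYCL backend is the Intel-optimized path. A minimal build sketch, assuming the oneAPI toolkit is installed at its default location (flags follow the llama.cpp SYCL build docs):

```shell
# Build llama.cpp with the SYCL backend for Intel GPUs.
# Assumes the oneAPI base toolkit is installed and provides icx/icpx.
source /opt/intel/oneapi/setvars.sh
cmake -B build -DGGML_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release -j
```

The Vulkan backend (`-DGGML_VULKAN=ON`) is the fallback when the SYCL toolchain isn't available; it's more portable but typically less tuned for Intel hardware.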