Yeah, I wouldn’t complain if one dropped in my lap, but they’re not at the top of my list for inference hardware.

Although... is it possible to pair a fast GPU with one? Right now my inference setup for large MoE LLMs keeps the routed experts in system memory, with the KV cache and dense parts on a GPU, and a Spark would do a better job of handling the experts than my PC, if only it could talk to a fast GPU.
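For anyone curious, that split is roughly what llama.cpp's tensor-override flag does (a sketch, not a recipe; the model path is a placeholder and the exact tensor-name regex varies by model architecture):

```shell
# Offload all layers to the GPU (-ngl 99), then override the routed-expert
# FFN tensors back to system RAM with a name regex; attention, shared
# experts, and the KV cache stay on the card.
llama-server -m ./model.gguf -ngl 99 \
  -ot ".ffn_.*_exps.=CPU"
```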

[edit] Oof, I forgot these have only 128GB of RAM. I take it all back; I still don’t find them compelling.