There is a push from multiple directions at the same time:
- new AI desktops built on GB10s: relatively cheap, and you can cluster them to get 1TB of VRAM
- Nvidia, AMD, Intel, Cerebras, etc. pushing new hardware
- OSS models getting crazy good, like GLM 5.2
- flash models getting very good, like DeepSeek V4 Flash
- quantization cutting memory requirements
- harnesses being able to use different models (big for difficult stuff, small for grunt work)
So hopefully soon, those of us who want to break free from APIs will be able to host a cluster of AI desktops at home, at a reasonable price and with Opus-level capabilities. Can't wait!!
I feel like "relatively" is doing a lot of work there: at about $4k per GB10, that's $36k for a 1TB cluster. Cheap compared to equivalent H200s, but out of reach for home labs that aren't funded with OpenAI or Anthropic RSUs.
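For anyone who wants to redo the back-of-envelope math, here is a rough sketch. The 128 GB of unified memory per GB10 box is my assumption (it is not stated above), and `cluster_cost` is just a hypothetical helper name. At exactly $4k per unit this gives 8 boxes and $32k, so the $36k figure corresponds to roughly $4.5k per unit, or a ninth box.

```python
import math

def cluster_cost(target_gb, gb_per_unit, price_per_unit):
    """Return (units needed, total price) to reach target_gb of memory."""
    units = math.ceil(target_gb / gb_per_unit)  # round up: no partial boxes
    return units, units * price_per_unit

# Assumption: 128 GB per GB10 unit; prices are the rough figures from this thread.
for price in (4000, 4500):
    units, total = cluster_cost(1024, 128, price)
    print(f"{units} units at ${price} each: ${total}")
```

Swap in your own memory target and street price; interconnect, switches, and power are not included.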
When the AI bubble pops, those hardware prices will pop too.
My hopes are on Intel Crescent Island with 480GB. I don't need 8x H200 performance (or cost), but I would like to run GLM 5.2 at Q8.
I'd love that too, but I'd guess Crescent Island with 480GB will cost something like $10-12k, or even more.
Hope you're right! Can't wait!