Yeah, I don't know of anyone wanting to run very large AI models in a Windows environment. Or, frankly, on a laptop. Why not just VPN into a dedicated server?

I do. I can take my laptop anywhere I want, for example to a coffee shop, and run a coding model while eating a croissant without worrying about an internet connection, as the term "local model" implies.

And you can warm up the croissant by just placing it on the trackpad while you wait for the LLM to finish.

The coffee shop doesn't have wifi?

It's not very good and SSH is blocked.

I always use a VPN (to my own home server) for this reason (and other reasons) when connecting to public WiFi.

With BUILD happening tomorrow, I suspect Microsoft is going to have some stuff about local AI there with MS Foundry on Windows/Foundry Local. The timing of this announcement a day before BUILD is obviously intentional.

Suddenly all the Windows K2 stuff makes sense, but I doubt it'll be enough. It's too little, too late for Microsoft.

How much does a dedicated server with 128GB of VRAM cost a month?

How well will the local LLM run when your laptop is in your bag while you're walking around?

You can get an H200 (141GB) here for $2,700/mo: https://deploybase.ai/articles/h200-price

I could be wrong, but my understanding is that 24/7 dedicated servers are wildly economically unviable. The reason cloud tends to cost less than local today (other than the subsidization) is that you aren't running models 24/7. So something like 6 hours of cloud per weekday might beat the yearly cost of building local machines, but it's not in the same universe if you're running 24/7, as evidenced by two months of H200 rental costing more than the DGX Spark this laptop is built out of.
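
For anyone who wants to plug in their own numbers, here is a rough sketch of the break-even arithmetic. The $2,700/mo rental rate is the H200 figure quoted above; the hardware price is a made-up placeholder, not a quoted price for any real machine, and the function name is just for illustration. It ignores electricity, resale value, and the performance gap between the two options.

```python
import math

def break_even_months(hardware_cost: float, monthly_rental: float) -> int:
    """Whole months of 24/7 rental after which renting has cost
    at least as much as buying the hardware outright."""
    if monthly_rental <= 0:
        raise ValueError("monthly_rental must be positive")
    return math.ceil(hardware_cost / monthly_rental)

H200_MONTHLY = 2700.0                 # rental rate quoted in the thread
HYPOTHETICAL_HARDWARE_COST = 4000.0   # placeholder assumption, swap in a real price

months = break_even_months(HYPOTHETICAL_HARDWARE_COST, H200_MONTHLY)
print(f"Renting overtakes buying after {months} month(s) of 24/7 use")
```

If you only rent for a few hours a day, scale the monthly rental down accordingly and the break-even point moves out by the same factor.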

I mean, not that much? You buy the hardware once and then it’s just running for many years.

Less than this laptop.