The competition to CUDA and proprietary 3D APIs always overlooks developer productivity.

For some strange reason there is this expectation, maybe due to the UNIX background of those folks, that portable APIs have to exist without good IDE tooling, graphical debuggers, high-level programming models, or a library ecosystem.

Then for some "strange" reason, GPU developers mostly pick the proprietary APIs, and the cycle repeats itself.

But the Modular stack is focused on developer productivity. It is still early, but there has been substantial work on all of these.

I have yet to see the same Windows love that CUDA gets.

Same for IDE integration and the graphical debugging experience for GPU code.

Until now, it has been the usual UNIX CLI and text-mode, lldb-like debugging for the CPU side.

At least that is what I have been made aware of.

It doesn't have a lot of Windows support yet because nobody deploys datacenter-scale AI serving on Windows OS.