There is also the problem that WebGPU doesn't really add much except for compute shaders, so older WebGL apps have hardly any reason to port. The other problem is that WebGPU is even more outdated at release than WebGL was at its release. When WebGL came out, it was maybe 5 years behind the state of the art. WebGPU only really arrived in major desktop browsers this year, and by now it's something like 15 years behind. OpenGL, which was de facto deprecated more than half a decade ago, is far ahead of WebGPU in terms of hardware capabilities and features.

This comparison is kind of sloppy, though. OpenGL on the desktop needs to be compared to a concrete WebGPU implementation. While it still lags behind the state of the art, `wgpu` exposes many features on desktop that aren't in the standard. For example, work on mesh shaders has started too: https://github.com/gfx-rs/wgpu/issues/7197. If you stuck to only what's compatible with WebGL2 on the desktop, you'd face similar limitations.
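Even through the browser API, what you actually get varies by implementation: anything beyond the baseline spec surfaces as optional features that you query on the adapter and request at device creation. A rough sketch (the feature names are just examples of real optional features; availability depends on browser and hardware):

```ts
// Sketch: optional capabilities beyond the baseline spec must be queried
// and explicitly requested; assumes an async module context.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error('WebGPU not available');

// What this particular implementation/hardware combination offers,
// e.g. 'timestamp-query', 'shader-f16', ...
console.log([...adapter.features]);

const device = await adapter.requestDevice({
  // Only features listed here are usable; requesting an unsupported
  // feature rejects device creation.
  requiredFeatures: adapter.features.has('timestamp-query')
    ? ['timestamp-query']
    : [],
});
```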

I'm of course talking about WebGPU for web browsers, and I'd rather not use a graphics API like wgpu whose support for the latest GPU features is uncertain. Especially since wgpu adopted the same paradigm as Vulkan, so it's not even that much nicer to use, yet you sacrifice lots of features. Also, Vulkan finally seems to be fixing mistakes like render passes and pipelines, whereas WebGPU (and I guess wgpu?) went all in on them.

Saying WebGPU “only” adds compute shaders is hugely reductive and misses how valuable an addition that is, from general-purpose compute through to simplifying rendering pipelines and compositing passes, etc.
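For context, this is roughly what the minimal compute path looks like through the browser API, something WebGL never offered. It's a sketch assuming a WebGPU-capable browser and an async context; the shader and buffer size are made up for illustration:

```ts
// Sketch: double every element of a storage buffer with a compute shader.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error('WebGPU not available');
const device = await adapter.requestDevice();

const module = device.createShaderModule({
  code: /* wgsl */ `
    @group(0) @binding(0) var<storage, read_write> data: array<f32>;
    @compute @workgroup_size(64)
    fn main(@builtin(global_invocation_id) id: vec3<u32>) {
      data[id.x] = data[id.x] * 2.0;
    }
  `,
});

// 1024 floats of general-purpose data, no vertex/texture framing needed.
const buffer = device.createBuffer({
  size: 1024 * 4,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});

const pipeline = device.createComputePipeline({
  layout: 'auto',
  compute: { module, entryPoint: 'main' },
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [{ binding: 0, resource: { buffer } }],
});

// Record and submit the dispatch through a command buffer.
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(1024 / 64);
pass.end();
device.queue.submit([encoder.finish()]);
```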

In any case, it’s not true. WebGPU also does away with the global driver state, which has always been a productivity headache and a source of bugs in OpenGL, and gives better abstractions with pipelines and command buffers.

I disagree. Yes, the global state is bad, but pipelines, render passes, and worst of all static bind groups and layouts are by no means better. Why would I need to create bind groups and bind group layouts for storage buffers? They're just buffers and references to them, so let me do the draw call and pass references to the SSBOs as arguments, rather than having to first create bindings and then cache them because they are somehow expensive.
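To make the complaint concrete, here is roughly the ceremony involved in letting a draw call read a single storage buffer through the JS API. This is an illustrative sketch; `device`, `pass`, `ssbo`, and `vertexCount` are assumed to already exist:

```ts
// Sketch of the ceremony: first describe the binding shape (layout), then
// bind the concrete buffer to that shape (bind group), and only then can
// the pass reference it.
const bindGroupLayout = device.createBindGroupLayout({
  entries: [{
    binding: 0,
    visibility: GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT,
    buffer: { type: 'read-only-storage' },
  }],
});

// The render pipeline must also be created against a layout containing
// this bind group layout for the setBindGroup call below to be valid.
const pipelineLayout = device.createPipelineLayout({
  bindGroupLayouts: [bindGroupLayout],
});

const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,
  entries: [{ binding: 0, resource: { buffer: ssbo } }],
});

// At draw time, instead of passing the buffer directly:
pass.setBindGroup(0, bindGroup);
pass.draw(vertexCount);
```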

Also, compute could easily have been added to WebGL, making WebGL pretty much on par with WebGPU, just 7 years earlier. It didn’t happen because WebGPU was supposed to be the better replacement, which it never became. It just became something different-but-not-better.

If you had to do in CUDA even half of the completely unnecessary stuff that Vulkan forces you to do, CUDA would never have become as popular as it is.

I agree with you that there's a better programming model out there. But using a buffer in a CUDA kernel is the simple case. Descriptors exist to bind general-purpose work to fixed-function hardware, and things get much more complicated once we start talking about texture sampling. CUDA isn't exactly great here either: kernel launches are more heavyweight than draw calls precisely because they defer things like validation to the call site. Making descriptors explicit is verbose and annoying, but it keeps resource switching front of mind, which is a big concern for workloads that lean heavily on those fixed-function resources. The ultimate solution here is bindless, but that presents its own problems for a nice general-purpose API, since you need to know all your resources up front. I do think CUDA is probably ideal for many users, but there are still trade-offs here.

It didn't happen because of Google; Intel did the work to make it happen.