some of this is what khronos standards are theoretically supposed to achieve.

surprise, it's very difficult to do across many hw vendors and classes of devices. it's not a coincidence that metal is much easier to program for.

maybe consider joining khronos since you apparently know exactly how to achieve this very simple goal...

> it's not a coincidence that metal is much easier to program for

Tbf, Metal also works on non-Apple GPUs, and it needs only minimal additional hints to manage resources in non-unified memory.