It feels like Nvidia has 30 "tile-based DSLs with Python-like syntax for ML kernels" in the works lol. I think they are very worried about open-source and portable alternatives to CUDA.
Not at all. They are the ones pushing for vendor-agnostic tensor core extensions in Vulkan, which would solve part of the portability issue: https://github.com/jeffbolznv/vk_cooperative_matrix_perf