Has anyone tried this for LLMs and how does it compare to Vulkan?

IIRC he was kind of funded by AMD for a while to break the CUDA LLM moat (hence the mention of a change of direction at the bottom of the article). No idea if it was legal issues with him being paid by them or something else that changed it, but...

1: Much of the initial research and early OSS was CUDA-focused due to libraries and momentum, but with more Mac users etc., kernels and frameworks have become more portable, so with stuff like Unsloth the CUDA moat is falling there. (Whether it's Vulkan or more proprietary backends doesn't matter; Vulkan and OpenCL are probably fairly equal in usefulness, since Vulkan support would probably require custom extensions for full performance.)

2: As alluded to above, ZLUDA seems to have focused a bit on ROCm APIs for AMD (I think I've seen mentions of support for other libraries, but it's been up and down IIRC since he was employed by AMD). The main benefit compared to Vulkan, IIRC, is that ROCm allows for some CUDA-like feature that makes kernel emulation much less troublesome (or even feasible at all?).

Do other models (VLMs, diffusion, etc.) also work fine with Vulkan?