Adding support for new hardware to PyTorch is actually quite convenient. I did that with WebGPU using the same PrivateUse1 mechanism TorchTPU used. Each hardware backend has its own slot and identifier in PyTorch, and when you want to add support for a new one without merging it into PyTorch itself, PrivateUse1 works essentially like a plug-in slot.
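To give a rough idea of what claiming that slot looks like from Python, here is a minimal sketch, assuming a recent PyTorch 2.x install. The name `my_device` is a hypothetical placeholder, not necessarily what torch-webgpu uses, and this only covers naming the slot; the actual kernels still have to be registered against the PrivateUse1 dispatch key from a C++ extension.

```python
import torch

# Give the reserved PrivateUse1 slot a human-readable name.
# "my_device" is a hypothetical name chosen for this illustration.
# PyTorch allows only one name for this slot per process.
torch.utils.rename_privateuse1_backend("my_device")

# After renaming, the name is accepted wherever a device string is expected.
dev = torch.device("my_device:0")
print(dev.type, dev.index)

# Creating tensors on this device would still fail at this point:
# no kernels are registered for it until a C++ extension provides them.
```

Once an extension registers operators for the PrivateUse1 key, calls like `tensor.to("my_device")` are routed to that backend by the dispatcher in the same way as for built-in devices.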

https://github.com/jmaczan/torch-webgpu