The best cross-platform API is CUDA, because we have ROCm, whose HIP layer mirrors the CUDA API on AMD GPUs.

Only superficially: what ROCm actually supports is a subset of what CUDA provides.