Except that all DeepSeek models so far have been trained on Nvidia hardware. For DeepSeek-V3, they literally mention right in the abstract that it was trained on 2,048 NVIDIA H800 GPUs: https://arxiv.org/html/2505.09343v1