Grok 3.5: 400M training run
DeepSeek R1: 5M training run

Released around the same time, marginal performance difference.

I suspect that says more about Grok than anything else.