Didn’t DeepSeek Zero mix all the languages together into something very efficient?

Interesting thought, but I assume a lot of the samples in the training corpus are translations between languages or the same text in different languages.