Let's pretend for a moment that the entire training corpus for Deepseek-R1 were released.

How would you download it?

Where would you store it?

I mean many people I know have 100tb+ in storage at home now. A large enough team of dedicated community members cooperating and sharing compute resources online should be able to reproduce any model.