They are exactly open source. The training data is the internet. Don't say it's on the internet. It IS the internet.

The training scripts are in Megatron and vLLM.