The training data is available on Hugging Face:

https://huggingface.co/datasets/arman-bd/guppylm-60k-generic
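For anyone who wants to pull the dataset linked above into Python, here is a minimal sketch. It assumes the Hugging Face `datasets` library is installed (`pip install datasets`) and that the repository is publicly readable. The helper names `repo_id_from_url` and `load_training_data` are hypothetical, introduced only for this example. The split and column names are not stated here, so the sketch does not assume any.

```python
from urllib.parse import urlparse

DATASET_URL = "https://huggingface.co/datasets/arman-bd/guppylm-60k-generic"


def repo_id_from_url(url: str) -> str:
    """Extract the 'owner/name' repo ID from a Hugging Face dataset URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path shape: datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def load_training_data():
    """Download the dataset (hypothetical helper; needs network access)."""
    # Imported lazily so repo_id_from_url works without the library installed.
    from datasets import load_dataset

    # With no split argument, load_dataset returns all available splits.
    return load_dataset(repo_id_from_url(DATASET_URL))
```

Calling `load_training_data()` and printing the result should list whatever splits and columns the repository actually provides, which is the safest way to check the schema before writing any training code against it.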