It's because the dataset is all lossy-compressed music, not the original source audio.

Basically, it was made with pirated MP3s.