Realistically, this data is going to be used to train a closed-source model, not to contribute to the scientific literature.