I'd say the majority of the training data is Reddit, scraped with zero care about whether a given post comes from a "good", "sarcastic", or "ironic" source.