Hacker News

The upside of Reddit data is that you have updoots and downdoots, so you can train your AI model toward what people would typically upvote and against what they might downvote.
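As a rough sketch of what that could look like in practice: one common way to use a vote signal is to turn sibling replies into (chosen, rejected) pairs for preference training. Everything here is an assumption for illustration, not a real pipeline: the `replies` sample data, the `preference_pairs` helper, and the `min_gap` threshold are all made up.

```python
from itertools import combinations

# Hypothetical sample data: replies to the same parent, with net vote scores.
replies = [
    {"text": "Here is a source with the actual numbers.", "score": 42},
    {"text": "Trust me, I work in the industry.", "score": 7},
    {"text": "You're wrong and you should feel bad.", "score": -15},
]

def preference_pairs(replies, min_gap=5):
    """Turn vote scores into (chosen, rejected) text pairs.

    min_gap is an assumed threshold: pairs whose scores differ by less
    than this are skipped as too noisy to count as a real preference.
    """
    pairs = []
    for a, b in combinations(replies, 2):
        if abs(a["score"] - b["score"]) < min_gap:
            continue
        # The higher-scored reply is treated as the preferred one.
        chosen, rejected = (a, b) if a["score"] > b["score"] else (b, a)
        pairs.append((chosen["text"], rejected["text"]))
    return pairs

pairs = preference_pairs(replies)
for chosen, rejected in pairs:
    print(f"prefer: {chosen!r}  over: {rejected!r}")
```

Note that pairs like these only encode what voters preferred, not what is accurate.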

Now, that's the upside. The downside is you end up with an AI catering to the typical redditor. Since many claims there are made on the basis of "confident, sounds reasonable, speaks with authority, gets angry when people disagree", hallucinations happen. What we want instead is something like "produces evidence-based claims with unbiased data sources".