> I don't like being part of an AI dataset.
This is understandable, but I'm sure all the HN comments have been a part of training dataset for many chatbots now. In fact, HN is a gold mine of sane, valuable comments, so it must have been helpful.
True, but I find it fairly offensive that my own data is being sold back to me. If it was free I'd be more tolerant. They say the model is costly and I believe it, but what exactly are the margins here? I feel like I've been recruited into some lame hustle.
Thank you for the feedback. This product was made to connect to your own database. I thought it would be fun to connect it to the HN BigQuery public dataset. We break even in a good month.
I hope you didn't read what I said as a personal attack. It's not; it's just my feedback on how I feel about this particular idea. I will say that it is clever though, even if it's definitely not for me.
I think the "ick" factor for me comes from the feeling that social engagement shouldn't really be queryable. When I participate here, it's an in-the-moment thing. While I realize my opinions are stored forever and searchable, and I generally stand by most of what I say, I think making meta-products around social engagement changes the flavor and the feeling of how we interact. It's like when someone points a camera at you. Sure, it doesn't really change anything, but also, it completely changes things, right?
HN is a loss leader for Y Combinator, which is literally a venture capital firm, lmao.
It's not the paint you are buying, it's the painting.
> I'm sure all the HN comments have been a part of training dataset for many chatbots now.
Be that as it may, I don’t think “everyone does it” is an excuse. An absurdly high number of people throw trash on the floor. I actively pick it up or at a minimum don’t contribute to the problem.
The answer to “many companies are unethically gathering your data” is not “it’s OK for me to be unethical too”.
Completely agree; I was not justifying any company, nor am I saying it is ethical. I'm just saying that regardless of your stance, the dataset has been utilized already.