That's how a lot of application-layer startups are going to make money. They sit on a pile of high-quality usage data. Either you monetize it yourself (Cursor), get acquired (Windsurf), or provide that data to others for a fee (LMSYS, Mercor). This is inevitable, and the market for it is only going to increase. If you want to prevent this as an org, there aren't many ways out: either use open-source models you can deploy yourself, or deal directly with model providers where you can sign specific contracts.
You're actually sending data to random GPUs connected to one of the Bittensor subnets that run LLMs.
Whoever operates those GPUs can, today, collect that data and sell it. There is work being done to add TEEs (trusted execution environments), but it isn't live yet.
Not every prompt is privacy sensitive.
For example you could use it to summarize a public article.
Every prompt is valuable.
And you are getting something valuable in return. It's probably a good trade for many, especially when they are doing something like summarizing a public article.
I'm not so sure. I have agents that do categorization work. Take a title, drill through a browse tree to find the most applicable leaf category. Lots of other classification tasks that are not particularly sensitive and it's hard to imagine them being very good for training. Also transformations of anonymized numerical data, parsing, etc.
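For concreteness, that drill-down categorization can be sketched as a simple tree walk where, at each level, a classifier picks the best child. This is a minimal illustration, not the commenter's actual system: the category tree is made up, and `choose()` is a naive keyword-match stand-in for what would in practice be an LLM call.

```python
# Hypothetical sketch: title -> leaf category via iterative drill-down.
# CATEGORY_TREE and choose() are illustrative stand-ins, not a real taxonomy.

CATEGORY_TREE = {
    "Electronics": {
        "Audio": {"Headphones": {}, "Speakers": {}},
        "Computers": {"Laptops": {}, "Desktops": {}},
    },
    "Home": {"Kitchen": {}, "Furniture": {}},
}

def choose(title, children):
    # Stand-in for an LLM prompt like:
    # "Which of these categories best fits <title>? Options: <children>"
    # Here: pick the first child whose name appears in the title,
    # falling back to the first option.
    t = title.lower()
    for c in children:
        if c.lower() in t:
            return c
    return children[0]

def categorize(title, tree):
    """Drill from the root down to the most applicable leaf category."""
    path = []
    node = tree
    while node:  # a non-empty dict is an internal node with children
        child = choose(title, list(node))
        path.append(child)
        node = node[child]
    return path
```

The point of the structure, rather than classifying against all leaves at once, is that each step only asks the model to discriminate among a handful of siblings, which keeps prompts short and the choices unambiguous.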
"one man's garbage is another man's treasure"
Using an AI for free is also valuable. Seems win/win.
This isn’t about reciprocal value. Even if something isn't privacy sensitive, it still holds value.