I don't trust any AI company not to use and monetise my data, regardless of how much I pay or what their terms of service say. I know full well that large companies ignore laws with impunity and without accountability.

I would encourage you to rethink this position just a little bit. Going through life not trusting any company isn't a fun way to live.

If it helps, think about those companies' own selfish motivations. They like money, so they like paying customers. If they promise those paying customers (in legally binding agreements, no less) that they won't train on their data... and are then found to have trained on their data anyway, they won't just lose that customer - they'll lose thousands of others too.

Which hurts their bottom line. It's in their interest not to break those promises.

> they won't just lose that customer - they'll lose thousands of others too

No, they won't. And that's the problem with your argument. Google landed in court for tracking users in incognito mode. They were also fined for not complying with the rules for cookie popups. Facebook lost in court for illegally using data for advertising. Did it lose them any paying customers? Maybe, but not nearly enough for them to even notice a difference. The larger outcome was that people are now more pissed at the EU over cookie popups that make the greed for data more transparent. Also, in Google's case, most of the money comes from different people than the ones whose privacy is violated, so the incentives don't work the way you suggest.

> Going through life not trusting any company isn't a fun way to live

Ignoring existing problems isn't a recipe for a happy life either.

Landing in court is an expensive thing that companies don't want to happen.

Your examples also differ from what I'm talking about. Advertising supported business models have a different relationship with end users.

People getting something for free are less likely to switch providers over a privacy concern than companies paying thousands of dollars a month (or more) for a paid service under the understanding that it won't train on their data.

>Landing in court is an expensive thing that companies don't want to happen.

"If the penalty is a fine, it's legal for the rich". These businesses also don't want to pay taxes or even workers, but in the end they will take the path of least resistence. if they determine fighting in court for 10 years is more profitable than following regulations, then they'll do it.

Until we start jailing CEOs (a priceless action), this will continue.

>companies paying thousands of dollars a month (or more) for a paid service under the understanding that it won't train on their data.

Sure, but are we talking about people or companies here?

The CEO says the action was against policy and they didn't know, so the blame passes down until you get to a scapegoat who can't defend themselves.

The underlying problem is that we have companies with more power than sovereign states, before you even include the power those companies have over the state.

At some point in the next few decades of continued transfer of wealth from workers to owners, more and more workers will snap and bypass the courts. That is what happened with the original fall of feudalism and warlords. This isn't guaranteed though -- if the company owners keep themselves and their allies rich enough they will be untouchable, same as drug lords.

> Until we start jailing CEO's (a priceless action)

In the context of the original thread here: If all you need to do is go to jail then whatever that's for was "for free"!

I can't agree with a 'companies won't be evil because they will lose business if people don't like their evilness!' argument.

Certainly, going through life not trusting any company isn't a fun way to live. Going through life not trusting in general isn't a fun way to live.

Would you like to see my inbox?

We as tech people made this reality by believing in an invisible hand of morality that would be stronger than power, stronger than the profit available from intentionally harming strangers a little bit (or a lot) at scale, over the internet, often in an automated way, whenever there was a chance we'd benefit from it.

We're going to have to be the people thinking of what we collectively do in this world we've invented and are continuing to invent, because the societal arbitrage vectors aren't getting less numerous. Hell, we're inventing machines to proliferate them, at scale.

I strongly encourage you to abandon the idea that the world we've created is optimal, and the idea that companies, of all things, will behave ethically because they perceive they'll lose business if they are evil.

I think they are fully correct in perceiving the exact opposite and it's on us to change conditions underneath them.

My argument here is not that companies will lose customers if they are unethical.

My argument is that they will lose paying customers if they act against those customers' interests in a way that directly violates a promise they made when convincing those customers to sign up and pay them money.

"Don't train on my data" isn't some obscure concern. If you talk to large companies about AI it comes up in almost every conversation.

My argument here is that companies are cold hearted entities that act in their self interest.

Honestly, I swear the hardest problem in computer science in 2025 is convincing people that you won't train on their data when you say "we won't train on your data".

I wrote about this back in 2023, and nothing has changed: https://simonwillison.net/2023/Dec/14/ai-trust-crisis/

I think you're making good points, but they aren't exactly counter-examples to the concerns being raised.

You are making the - correct! - point that _other companies_ who have paid contracts with an AI provider would impose significant costs on that provider if those contracts were found to be breached. The company would leave and stop paying its huge subscription, and/or the reputational fallout would be a cost.

But companies aren't people, and are treated differently from people. Companies have lawyers. Companies have the deep pockets to fight legal cases. Companies have publicity reach. If a company is mistreated, it has the resources and capabilities to fight back. A person does not. If J. Random Hacker somehow discovers that their data is being used for training (if they even could), what are they gonna do about it - stop paying $20/month, and post on HN? That's negligible.

So - yes, you're right that there are cold-hearted profit-motivated self-interested incentives for an AI provider to not breach contract to train on _a company's_ data. But there is no such incentive protecting people.

EDIT: /u/johnnyanmac said it better than me:

>> If they promise those paying customers (in legally binding agreements, no less) that they won't train on their data... and are then found to have trained on their data anyway, they won't just lose that customer - they'll lose thousands of others too.

> I sure wish they did. In reality, they get a class action, pay off some $100m to lawyers after making $100b, and the lawyers maybe give me $100 if I'm being VERY generous, while the company extracted $10,000+ of value out of me. And the captured market just keeps on keeping on.

Yes, my argument is mainly with respect to paying customers who are companies, not individuals.

I have trouble imagining why a company like Anthropic would go through the additional complexity of cheating their individual customers while not doing that to their corporate customers. That feels like a whole lot of extra work compared to just behaving properly.

Especially given that companies consist of individuals, so the last thing you want to do is breach the privacy of a personal account belonging to the person who makes purchasing decisions at a large company!

I mean this earnestly, not snidely - I wish I still had the faith you do that you won't be treated abominably by any and every company, or that they wouldn't default to behaving improperly at any opportunity and for the barest profit margin. It would be nice to still believe that good things could happen under capitalism.

(as a sidenote, I'm very grateful for your insightful and balanced writing on AI in general. It played a considerable part in convincing me to give AI tooling another go after I'd initially written it off as more trouble than it was worth)

>Going through life not trusting any company isn't a fun way to live.

Isn't that the Hacker mindset, though? We want to trailblaze solutions and share them with everyone for free. Always free as in liberty, and oftentimes free as in beer too. I think it's a good mentality to have, precisely because of your lens of selfish motivations.

Wanting money is fine. If it were some flat $200 or even $2000 with a legally binding promise that I have an indefinite license to use this version of the software and they won't extract anything else from me: then fine. Hackers can be cheap, but we aren't opposed to barter.

But that's not the case. Wanting all my time and privacy and data under the veneer of something hackers would provide with no or very few strings attached is not. Using tricks to push people into that model is all the worse.

> If they promise those paying customers (in legally binding agreements, no less) that they won't train on their data... and are then found to have trained on their data anyway, they won't just lose that customer - they'll lose thousands of others too.

I sure wish they did. In reality, they get a class action, pay off some $100m to lawyers after making $100b, and the lawyers maybe give me $100 if I'm being VERY generous, while the company extracted $10,000+ of value out of me. And the captured market just keeps on keeping on.

Sadly, this is not a land of hackers. It is a market of passive people from various walks of life: of students who do not understand what is going on under the hood (I was here when Facebook was taking off), of businessmen too busy with other stuff to understand the sausage being made in the factory, of ordinary people who just want to fire and forget. This market may never even be aware of what occurred here.

This is so naive.