I wish there was some easy way to bet against this happening. I would put a lot of money on the side of this never happening for a multitude of reasons, but I bet I could collect a lot of money from cynics and doomers who think this stuff will happen.
As a devil's advocate, why do you trust the AI companies to behave as you suggest and not the other way? You say you have a multitude of reasons, but list none. We have already seen by example that the AI companies do not care about laws and will circumvent societal norms as long as they get a leg up, so it's not a stretch to think they'd do things like this too.
It isn't just out of the kindness of their hearts that they don't do this. There are laws and regulations. There is also legal risk and reputation. I have to go through a legal and privacy process at my big corp job whenever I want to record a new timestamp and I need to ensure that the data is used appropriately and that it is wiped later. I've only seen these compliance requirements become more onerous over the past ten years and I expect that to continue.
> There are laws and regulations. There is also legal risk and reputation.
One of the big companies, Meta, already decided to go ahead and grab terabytes of pirated books to feed their LLM. [0]
Therefore I would not give them (or similar entities) the benefit of the doubt when it comes to how they might use text that customers "gave" them under some unreadably-favorable terms of service.
With PII, the pirated-books example is doubly relevant, because the accusation of "this output is reproducing my copyrighted work" is very similar to "this output is revealing my private data". The fuzzy black-box nature of the algorithms offers ways to stymie enforcement, by arguing that victims or regulators cannot conclusively prove a chain of causation with zero coincidences.
[0] https://www.theatlantic.com/technology/archive/2025/03/libge...
Is the reputational risk of pirating terabytes of books worse than the reputational risk of shredding (destructively scanning) millions of books?
https://arstechnica.com/ai/2025/06/anthropic-destroyed-milli...
Fair enough. I don't use Facebook at all because I don't respect or trust the company or its mission. I do use Gemini and Claude though.
Google is an ad company; I'd be very... cautious with the trust here.
https://apnews.com/article/google-smartphone-surveillance-ve...
Why? What has Google or Anthropic done that suggests they are trustworthy? Google is infamous for not not being evil. It's not like either asked for permission to access copyrighted material either. Not one tech company deserves trust. They all should be treated as suspect. I don't expect anyone to trust anything I make for the simple reason I don't trust anything anyone else makes.
More specifically, the CEO said that users are "dumb f*cks" for submitting data to Facebook, the predecessor of Meta.
You wouldn't win because those cynics don't really believe their own nonsense to the extent of risking money over it. But if there was an option to bet, one we could point them to and say, "if you really believe it then here's your chance at free money", maybe some of them would reconsider their belief.
Dunno about that near-blackmail scenario, but 23andMe filed for Chapter 11 last year and the database was sold for $305m.
People are rightly worried about that, but is there any indication that it nullifies any privacy contracts around the data? Is it:
1) We know that the privacy terms attached to the data are still legally binding, and those worried about it are freaking out over nothing,
2) We know that those contracts are null and void, and there are no restrictions on what can be done with that data beyond the blanket legal protections for such biological data, or
3) It's an open legal question
I don't understand the legal terms of something like this in bankruptcy, if the data are seen as being separated from the contractual obligations that acquired them.
And the government is sleeping and mostly worried about how to implement ID verification...
Yeah, but someone at one of the LLM providers would bet against you and do it, just to take your money. If someone bets $100k that your house won't burn down, with pictures posted within 30 minutes of it happening, it probably will.