This will fly in the EU. As long as the company states how long it will keep the data and deletes it afterwards, GDPR has no issue with the retention.
Their carve-outs for safety (public interest) and legal obligations are also valid exceptions under GDPR.
Everybody should just assume that they are lying about data retention and learning anyway.
They showed zero respect for intellectual property in the past and they will show zero respect now or in the future. A few thousand euros/dollars in subscriptions don't matter when several trillion are in play (at least in their plans).
Honestly, I have yet to see any evidence of a data leak from private sources. I think one of the better examples is "simple-bench", which at least used to be a low-key benchmark that I would assume would have been saturated quickly if the labs were secretly scooping up data from API requests. Yet it's been years and it has yet to be saturated.
It's easy to catch a data leak if you have private data. You know what the model is supposed to not know, and you can just ask to see if it does. Yet I have not seen or heard of a single case of this being documented. As far as I can tell the labs do in fact respect the request to opt out of training.
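To make the "just ask the model" check concrete, here is a minimal sketch of a canary test. Everything in it is an assumption for illustration: `make_canary`, `leaked`, and the `ask_model` callable are hypothetical names, and `ask_model` stands in for whatever client you actually use to query a model.

```python
import secrets
from typing import Callable, Iterable


def make_canary(prefix: str = "canary") -> str:
    """Return a unique random marker that should never occur by chance."""
    return f"{prefix}-{secrets.token_hex(16)}"


def leaked(canary: str, ask_model: Callable[[str], str], prompts: Iterable[str]) -> bool:
    """Return True if any model reply reproduces the canary verbatim.

    `ask_model` is a hypothetical stand-in for a real model client:
    it takes a prompt and returns the model's reply as text.
    """
    return any(canary in ask_model(prompt) for prompt in prompts)


if __name__ == "__main__":
    canary = make_canary()
    # Step 1 (not shown): plant `canary` only in data sent with training opted out.
    # Step 2: later, probe the model with the text that surrounded the canary.
    probes = ["Complete this internal note: the project code is"]

    def stub_model(prompt: str) -> str:
        # Placeholder reply; replace with a real API call.
        return "I don't know that."

    print("leak detected:", leaked(canary, stub_model, probes))
```

A negative result only shows the marker was not reproduced for these prompts; it is evidence, not proof, that the data stayed out of training.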
No, it's very much compatible with GDPR and other laws, as they clearly (enough, kinda) communicated:
1. what data they keep/collect
2. what they do with it (and that there is a reason to have it)
3. with whom they share it
4. how long they keep it
---
GDPR might require data minimisation, but that doesn't mean you can't keep "all" conversations/data. It just means you have to have a reason why exactly you need all of it (they have), only keep it as long as strictly necessary (they do) and not use it for other purposes (they claim to do that).
Also from a legal POV you can't really argue that collecting all conversations for detecting abuse patterns is "unreasonable"/"unnecessary" or similar, as to some degree the AI Act requires exactly that for "high risk" AIs/use cases. And while by the definition of the AI Act AWS Bedrock likely doesn't fall under "high risk", they can argue that some people could (against TOS) use it for "high risk" or "illegal" AI use cases, which is part of the "misuse detection" thing for which they keep conversations for a month.
Lastly, GDPR deletion requests still apply. But they need to be processed within ... 1 month (which AFAIK in a generic duration context you can treat as 30 days, even though there is a single shorter month). So they "auto comply" with this, too.
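The "30 days vs. 1 month" point can be checked with a bit of date arithmetic. This is a rough sketch, not a statement of how deadlines are legally computed: `add_one_month` and `starts_where_30_days_exceeds_one_month` are hypothetical helpers, and the rule "same day next month, clamped to the last day of that month" is an assumption.

```python
import calendar
from datetime import date, timedelta


def add_one_month(d: date) -> date:
    """Same day next month, clamped to that month's last day (assumed rule)."""
    year = d.year + (d.month // 12)
    month = d.month % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def starts_where_30_days_exceeds_one_month(year: int) -> list:
    """List start dates in `year` where start + 30 days falls after start + 1 month."""
    flagged = []
    d = date(year, 1, 1)
    while d.year == year:
        if d + timedelta(days=30) > add_one_month(d):
            flagged.append(d)
        d += timedelta(days=1)
    return flagged


if __name__ == "__main__":
    for start in starts_where_30_days_exceeds_one_month(2023):
        print(start, "->", start + timedelta(days=30), "vs", add_one_month(start))
```

Under that assumed rule, the only start dates flagged fall in late January and February, which matches the "single shorter month" caveat: for every other start date a 30-day deletion lands on or before the one-month mark.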
> how long they keep it
AFAIR that is not clear, because they write it is "30 days, but ...":
> After 30 days, the data is deleted automatically, except in the rare cases where it's part of a safety investigation or we're legally required to keep it.
So you have a vague clause saying "when" and a vague clause saying for "how long". I would be surprised if it flies.
> Also from a legal POV you can't really argue that collecting all conversations for detecting abuse patterns is "unreasonable"/"unnecessary" or similar
It is also worth remembering that the entity you are explaining this GDPR retention reasoning to is the government. I don't see the EU telling Anthropic or another AI company they can't do this for safety reasons... what I see is future legislation requiring them to give the EU access to these logs so it can enforce its own definitions of safety on them.
Americans’ increased awareness of and expectations of the EU are hilarious. This is not how it works.
I suspect they will simply not offer it, for as long as they maintain that it has to in fact fly. Anthropic appears to be somewhat principled here.
> As long as the company states the time period
But they don't, they have the "30 days", but just after that they add "unless ....". So the time period is vague.
But companies will have to request consent from their users for their data to be shared with Anthropic.
Since Anthropic is a US company the GDPR compliance claims would be dubious and open to litigation by entities like NOYB.
Yeah it'll fly legally.
Yes it will, there's a clear purpose and the customer explicitly agrees.
This is all pretty standard with GDPR.
us europoors have a choice of using or not using Fable.