Any new "defense" that claims to use adversarial perturbations to undermine GenAI training should have to explain why this paper does not apply to their technique: https://arxiv.org/pdf/2406.12027
The answer is, almost unfailingly, "this paper applies perfectly to our technique because we are just rehashing the same ideas on new modalities". If you believe it's unethical for GenAI models to train on people's music, isn't it also unethical to trick those people into posting their music online with a fake "defense" that won't actually protect them?
You are assuming input-transformation-based defenses in the image domain transfer to the music recognition domain, when we know they don't even automatically transfer to the speech recognition domain.
But 'protection' of any one song isn't the entire point. It takes only a fraction of a percent of corpus data to have persistent long-term effects on the final model, or to increase the costs and review burden for those stealing their content.
Since most training is unsupervised, because of the cost of and limited access to quality human-labeled data, it wouldn't take much for even obscure, limited-market older genres that still have active fan bases, like noise rock, to start filtering into recommendation engines and hurt user satisfaction.
Most of the speech protections just force attacks into the perceptible audio range. In lo-fi material like trip hop, those would be non-detectable without the false-positive rate going way up. With bands like Arab On Radar, Shellac, or The Oxes, they wouldn't be detectable at all.
But it is also like WAFs/AV software/IDS. The fact that it can't help with future threats today is immaterial. Any win over these leeches has some value.
Obviously, any company intentionally applying even the methods in your linked paper to harvest protected images would be showing willful intent to circumvent copyright protections, and I am guessing most companies will just toss any file they think has active protections, simply because of how sensitive training is.
Most musicians also know that copyright only protects the rich.
We talked to Nicholas Carlini about this attack (he's one of the authors) in what is one of my top 3 episodes of SCW:
https://securitycryptographywhatever.com/2025/01/28/cryptana...
I am ignorant here, this is a genuine question - is there any reason to assume that a paper solely about image mimicry can be blanket-applied, as OP is doing, to audio mimicry?
To add, all the new audio models (partially) use diffusion methods that are exactly the same as those used on images: the audio generation can be thought of as image generation of a spectrogram of an audio file.
For early experiments, people literally took Stable Diffusion and fine-tuned it on labelled spectrograms of music snippets, used the fine-tuned model to generate new spectrogram images guided by text, and then turned those images back into audio by re-synthesizing the spectral image into a .wav.
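A rough sketch of that final re-synthesis step, assuming librosa and Griffin-Lim phase estimation (the file names and parameters here are made-up assumptions, not Riffusion's actual code):

    import numpy as np
    import librosa
    import soundfile as sf

    sr, n_fft, hop = 22050, 2048, 512

    # Pretend this is the mel power spectrogram recovered from the image the
    # diffusion model generated (shape: n_mels x frames).
    mel = np.load("generated_mel_spectrogram.npy")  # hypothetical file

    # Invert the mel spectrogram to a waveform; Griffin-Lim re-estimates the
    # phase information that the image representation threw away.
    audio = librosa.feature.inverse.mel_to_audio(
        mel, sr=sr, n_fft=n_fft, hop_length=hop, n_iter=64)

    sf.write("generated.wav", audio, sr)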
Riffusion was one of the first to experiment with this, 2 years ago now: https://github.com/riffusion/riffusion-hobby
The more advanced music generators out now, I believe, take more of a 'stems' approach with a larger processing pipeline to increase fidelity and add vocal-tracking capability, but the underlying idea is the same.
Any adversarial attack that hides information in the spectrogram to fool the model into categorizing the track as something it is not is no different from the image adversarial attacks, which have already been shown to have mitigations.
Various forms of filtering for inaudible spectral information coupled with methods that destroy and re-synthesize/randomize phase information would likely break this poisoning attack.
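As a sketch of what such a scrubbing pass could look like (the cutoff, FFT size, and iteration count are arbitrary assumptions, and a real pipeline would be far more careful about audio quality):

    import numpy as np
    import librosa
    import soundfile as sf
    from scipy.signal import butter, sosfiltfilt

    def scrub(path_in, path_out, cutoff_hz=16000):
        y, sr = librosa.load(path_in, sr=44100)

        # 1. Low-pass filter to discard high-frequency content where
        #    near-inaudible adversarial energy tends to hide.
        sos = butter(8, cutoff_hz, btype="low", fs=sr, output="sos")
        y = sosfiltfilt(sos, y)

        # 2. Keep only the magnitude spectrogram and re-estimate phase from
        #    scratch with Griffin-Lim, destroying any phase-encoded perturbation.
        mag = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))
        y_clean = librosa.griffinlim(mag, n_iter=64, hop_length=512)

        sf.write(path_out, y_clean, sr)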
The short answer is that they are applying the same defense to audio as to images, and so we should expect that the same attacks will work as well.
More specifically, there are a few moving parts here - the GenAI model they're trying to defeat, the defense applied to data items, and the data cleaning process that a GenAI company may use to remove the defense. So we can look at each and see if there's any reason to expect things to turn out differently than they did in the image domain. The GenAI models follow the same type of training, and while they of course have slightly different architectures to ingest audio instead of images, they still use the same basic operations. The defenses are exactly the same - find small perturbations that are undetectable to humans but produce a large change in model behavior. The cleaning processes are not particularly image-specific, and translate very naturally to audio. It's stuff like "add some noise and then run denoising".
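To make the "add some noise and then run denoising" step concrete, here is a toy audio version (a sketch only; a GenAI company would use a learned denoiser, e.g. a diffusion model run partway, rather than the Wiener filter stand-in here, and the noise level is an arbitrary assumption):

    import numpy as np
    import librosa
    import soundfile as sf
    from scipy.signal import wiener

    def noise_then_denoise(path_in, path_out, noise_std=0.005):
        y, sr = librosa.load(path_in, sr=None)

        # Drown the tiny adversarial perturbation in random noise...
        noisy = y + np.random.normal(0.0, noise_std, size=y.shape)

        # ...then denoise. The reconstruction keeps what humans hear and
        # discards the carefully crafted residue the defense relied on.
        cleaned = wiener(noisy, mysize=29)

        sf.write(path_out, cleaned, sr)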
Given all of this, it would be very surprising if the dynamics turned out to be fundamentally different just because we moved from images to audio, and the onus should be on the defense developers to justify why we should expect that to be the case.
>find small perturbations that are undetectable to humans but produce a large change in model behavior.
What artists don't realize is that by doing this they are just improving the models relative to human capabilities. Adversarial techniques, like making a stop sign look like something else, will likely be weeded out of the model as model performance converges to average or above-average human performance.
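For reference, the quoted idea of a small perturbation that flips the model's output is typically implemented with something like the fast gradient sign method. A minimal PyTorch sketch (the classifier, labels, and spectrogram tensor are placeholders, not any real defense's code):

    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, spec, wrong_label, eps=0.003):
        """Nudge a spectrogram so `model` reads it as `wrong_label`, while the
        change stays small enough to be (ideally) imperceptible to listeners."""
        spec = spec.clone().detach().requires_grad_(True)

        loss = F.cross_entropy(model(spec), wrong_label)
        loss.backward()

        # Step against the gradient to pull the prediction toward the wrong
        # label: tiny per-bin changes, large effect on the model's output.
        return (spec - eps * spec.grad.sign()).detach()

    # Hypothetical usage: `model` is any genre/style classifier, `spec` a batch
    # of mel spectrograms with shape (1, 1, n_mels, frames):
    # poisoned = fgsm_perturb(model, spec, torch.tensor([wrong_genre_idx]))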
How long until somebody comes up with another reCAPTCHA-type system that forces users to click on images to identify them, with that data then used to verify training data for LLMs? (Assuming this isn't happening already.)
Google’s captchas have always been used for AI training as far as I know. For example the early versions where you had to type in two displayed words were used for Google’s book scanning program.
Well, the original purpose was to do OCR for things like the NYT's archives and other libraries. The part where you identify road signs & traffic lights was supposedly to train self-driving cars. Now, it's apparently just more analytics & tracking for Google to sell you things. [1]
But since LLMs are so error prone, and AI companies don't seem to want to pay humans to verify either that the data being fed into LLM training is valid or that the output is accurate, something like a forced CAPTCHA could be used to verify correct LLM data with unpaid labor.
It's just a dystopian thought I had. I probably shouldn't have said it out loud (it might give them ideas).
[1] https://www.techradar.com/pro/security/a-tracking-cookie-far...
>Well, the original purpose was to do OCR for things like the NYT's archives and other libraries. The part where you identify road signs & traffic lights was supposedly to train self-driving cars. Now, it's apparently just more analytics & tracking for Google to sell you things.
You seem hung up on the idea of the original purpose being one specific thing. The original purpose was to create a dataset to train AIs, the first adopters were OCR programs and such, but it's not like it was created to only be used for that one thing.
Thanks!
Some of the sibling comments had questions around purposefully releasing defenses which don’t work. I think Carlini’s (one of the paper authors) post can add some important context: https://nicholas.carlini.com/writing/2024/why-i-attack.html.
TLDR: Once these defenses are broken, all previously protected work is perpetually unprotected, so they are flawed at a foundational level.
Ignoring these arguments and pretending they don’t exist is pretty unethical.
I'm sure everyone involved wants the defense to work, so it seems a logical leap to say they know it doesn't and are doing this as a scheme?
>so it seems a logical leap to say they know it doesn't and are doing this as a scheme?
In some of the earlier image protection articles the people involved seemed rather shady about the capabilities. Would have to do some HN searching for those articles.
But at the end of the day everything will be a scheme if the end result is for humans to listen to it. You cannot make a subset of music that can be heard by humans (and actually sounds good) that cannot be prefiltered to be learned by AI. I've said the same thing about images, and the same thing will be true about audio, movies, actions in real life, and so on.
These schemes will likely work for a few of the existing models, then fall apart quickly the moment a new model arrives. What is worse for the defense is that audio quality for humans stays the same while GPU speeds and algorithms keep improving, meaning the time until a model beats each new defense will keep shrinking.
Right, but that just makes it a failed defense, not a scheme to dupe artists into false confidence. Maybe the result will be similar but I don't think the intent here is a con, it sounds pretty genuine.
I think of it as a claim like "we almost have a machine that violates thermodynamics". To avoid confusing laymen, who will automatically assume unlimited energy has been created, such claims must clearly define what has actually been accomplished.
While the artist in question can have the best intentions, conmen will swoop down on this and productize it, and then artists will be sad and confused when it has zero long-term effect on machine learning. That is, except for making machine learning more resilient to adversarial attacks.