1. One-sample detection is impossible. These detection methods work at the distributional level, more like a two-sample test in statistics, which means you need to collect a large amount of generated text from the same model for the test to reach significance. Detection from a single short piece of generated text is theoretically impossible. For example, imagine two different Gaussian distributions: you can never be 100% certain whether a single sample comes from one Gaussian or the other, since both share the same support (see the sketch after this list).
2. Adding watermarks may degrade an LLM's output quality, which is why I don't think they will be widely adopted.
3. Consider this simple task: ask an LLM to repeat exactly what you said. Is the resulting text authored by you, or by the AI?
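To make point 1 concrete, here is a minimal sketch (assuming NumPy and SciPy; the means 0.0 and 0.3 are arbitrary stand-ins for "human" and "model" score distributions): a single sample is ambiguous, but a two-sample test over many samples separates the distributions easily.

```python
# Illustration of point 1: one sample cannot distinguish two overlapping
# Gaussians, but a two-sample test over many samples can. The means and
# sample sizes below are arbitrary, for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
human = rng.normal(loc=0.0, scale=1.0, size=1000)  # stand-in "human" scores
model = rng.normal(loc=0.3, scale=1.0, size=1000)  # stand-in "model" scores

# One sample: x = 0.3 is a plausible draw from either distribution,
# since both densities assign it comparable likelihood.
x = 0.3
print(stats.norm.pdf(x, loc=0.0), stats.norm.pdf(x, loc=0.3))

# Many samples: a two-sample t-test separates the distributions decisively.
t, p = stats.ttest_ind(human, model)
print(f"t = {t:.2f}, p = {p:.2e}")
```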
For images/video/audio, removing such a watermark is very simple: add noise to the generated image and then use an open-source diffusion model to denoise it, and the watermark is broken (see the sketch below). Or, for an autoregressive model, regenerate the watermarked output with an open-source model using "teacher forcing", lol.
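A minimal sketch of that noise-then-denoise attack (SDEdit-style), assuming the open-source diffusers library; the model ID and the strength value are illustrative assumptions, not a tested recipe:

```python
# Sketch of the noise-then-denoise attack: the img2img pipeline adds noise to
# the watermarked image up to a chosen strength, then denoises it, which can
# wash out pixel-level watermark signals. Model ID and strength are assumptions.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # any open-source diffusion checkpoint
    torch_dtype=torch.float16,
).to("cuda")

watermarked = Image.open("watermarked.png").convert("RGB")

# strength in (0, 1]: how far toward pure noise to go before denoising.
# High enough to destroy the watermark, low enough to preserve the content.
attacked = pipe(
    prompt="",            # a caption of the image can improve fidelity
    image=watermarked,
    strength=0.3,
).images[0]
attacked.save("attacked.png")
```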
I wonder where you got that impression. Several professional watermarking systems for movie-studio content that I have worked with (and on) are highly resistant to noise removal while remaining imperceptible.
Based on my research experience and judgment: I have published several top-conference papers in both the detection and diffusion domains, though I haven't explored the engineering/product side. I believe that if such a system hasn't been built yet, it wouldn't be difficult to create one that removes the watermark using an open-source image/video model while maintaining high quality. Would you be interested in discussing this further?
UMG (music label) has been watermarking their music for many years now, and I'm unaware of any tool to remove their watermarks.
If such a tool existed, do you think it would benefit the community or not?
For text: have a big model generate the "intelligent" answer, then ask a local LLM to rephrase it (sketched below).
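A hedged sketch of that paraphrase attack, assuming Hugging Face transformers; the model name is an arbitrary example of a local instruct model, not a recommendation:

```python
# Paraphrase attack: a watermark-free local model rewrites the watermarked
# text, destroying token-level watermark statistics while keeping the content.
# The model name is an arbitrary example of a local instruct model.
from transformers import pipeline

paraphraser = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    device_map="auto",
)

watermarked_text = "..."  # the "intelligent" answer from the big (watermarked) model

prompt = (
    "Rephrase the following text in your own words, "
    f"preserving its meaning:\n\n{watermarked_text}"
)
out = paraphraser(prompt, max_new_tokens=512, return_full_text=False)
print(out[0]["generated_text"])
```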
Yeah exactly, you can always do that by using another model that doesn't have the watermark.