I really hope SynthID becomes a widely adopted standard - at the very least, Google should implement it across its own products like NotebookLM.
The problem is becoming urgent: more and more so-called “podcasts” are entirely fake, generated by NotebookLM and pushed to every major platform purely to farm backlinks and run black-hat SEO campaigns.
Beyond SynthID or similar watermarking standards, we also need models trained specifically [0] to detect AI-generated audio. Otherwise, the damage compounds - people might waste 30 minutes listening to a meaningless AI-generated podcast, or worse, absorb and believe misleading or outright harmful information.
[0] 15,000+ AI-generated fake podcasts: https://www.kaggle.com/datasets/listennotes/ai-generated-fak...
Given that humans also generate "misleading or outright harmful" information, why is it more pressing to track such content when it is generated by AI?
I suppose efficiency? It's easier to filter out petabytes of AI slop than to determine which human-generated content is harmful.