You cannot both have an algorithm that stops AI and allows humans to listen in the long run.
Eventually encoders/classifiers will get an 'anthropomorphic' loop of human ear capabilities to use as a prefilter and identify when sounds differ from the expected human hearing and identify the noise.
I mean this is what adversarial reinforcement learning is, humans cannot beat AI on this in the end.