Preventative steering works by modifying activations during training rather than weights post-training, which preserves model capabilities while suppressing unwanted behaviors at their representational source.