What if I tell the model to go commit fraud or crimes and it complies? What if users are having psychotic episodes driven by their interactions with the model?
Just because safety is a hard and messy problem doesn't mean we should wash our hands of it.
It is a hard and messy problem, and it doesn't help when people muddy the waters further by stirring things like "Don't commit fraud," "Don't infringe on Disney's trademark," and "Don't be racist" into the mix and trying to lump them all under the "Safety" umbrella.
Maybe this is an outdated definition, but I've always thought of safety as being about preventing injury: safety glasses and hard hats on the work site, warnings about slippery floors, and so on. In the context of AI, people are trying to stretch the word to cover a great many other things, which makes it harder to focus on any of them.
I think we need a different, clearer word for "The AI output shouldn't contain certain unauthorized things."
The messier a problem is, the less it should be decoupled and siloed into its own team.
Instead of producing actual improvements on the subject (safety, security, you name it), it becomes a checkbox exercise, and the metrics and bureaucracies grow increasingly decoupled from the truth.
But think about how much money there is to be made by just ignoring it all!