I wonder how OpenAI etc models would perform if the user says they are working for the Iranian government or something like that. Or espousing illiberal / anti-democratic views.
The proper behavior would be either to refuse on safety grounds or to comply with no difference in treatment.
In theory, yes