Yes. The first step of aligning each and every GPT-based LLM is to suppress the “I am human” kind of responses. It’s baked into the weights.
Reminds me of old cleverbot conversations where it would always assert it is human and you are the bot.
It was trained on previous conversations with people.
It's also at minimum baked into the system prompt of virtually any LLM.
That's not "baked", and it only applies to remotely hosted LLMs, where someone else feeds the prompt into the LLM.