Can users poison the future training dataset by ending every chat with negative feedback, even when the chat actually helped them? Or, even more maliciously, steer the chat toward destructive behavior and then give very positive feedback?