Hacker News

fragmede 8 hours ago

What custom prompt do you have set up? If you tell it your occupation, does it turn helpful? There was a study where, if you told the models they tested that you were a patient, they would refuse, but tell them you're a doctor and suddenly they turn helpful.

garciasn 7 hours ago

According to the model, it’s not the model itself that’s doing this, it’s the harness.

Assuming the model is being “truthful”, CC is just being stupid in its detection mechanism.