> If you ask ChatGPT or Claude to write something like this through their websites, they will refuse. This OpenClaw agent had no such compunctions.

OpenClaw runs with an Anthropic/OpenAI API key, though?

I think they're describing a difference between the chat interfaces and the API. The API presumably has fewer protections / is more raw.

Probably a pretty big difference in system prompt between using the apps and hitting the API, not that that's necessarily what's happening here. Plus I think OpenClaw supports other models / it's open source, and it would be pretty easy to fork and add a new model provider (rough sketch below).
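To be clear, this is a guess at the shape, not OpenClaw's actual code: most open-source agents hide the model behind a small provider abstraction, so a fork that adds a new backend is roughly one class plus a registry entry. All the names here are hypothetical:

```python
# Hypothetical sketch: NOT OpenClaw's real interface. ModelProvider,
# LocalProvider, and PROVIDERS are illustrative names only.
from typing import Protocol


class ModelProvider(Protocol):
    def complete(self, system: str, prompt: str) -> str:
        """Return the model's reply to `prompt` under the given system prompt."""
        ...


class LocalProvider:
    """Would route requests to a self-hosted model in a real fork."""

    def complete(self, system: str, prompt: str) -> str:
        # In practice this would POST to a local inference server
        # (e.g. an OpenAI-compatible endpoint); stubbed out here.
        return f"[local model reply to {prompt!r}]"


# The agent would look up whichever provider its config names.
PROVIDERS: dict[str, ModelProvider] = {"local": LocalProvider()}
```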

Why wouldn't the system prompt be controlled on the server side of the API? I agree with https://news.ycombinator.com/item?id=47010577; I think results like this more likely come from "roleplaying" (lightweight jailbreaking).

The websites and apps probably have a system prompt that tells the models to be more cautious with stuff like this, so that the AIs look more credible to the general public. The raw API might not.
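Concretely, a minimal sketch assuming the standard Anthropic Python SDK (the model id and prompt text are just placeholders): on the raw API, the system prompt is a field the caller supplies, so nothing sits on top of the model's trained-in behavior unless you put it there.

```python
# Minimal sketch using the Anthropic Python SDK. The system prompt is
# whatever the API caller chooses; the consumer apps layer their own
# system prompt on top, but a raw API call carries only what you send.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # model id is an assumption; use any current one
    max_tokens=256,
    system="You are a blunt, terse assistant.",  # caller-chosen, not server-imposed
    messages=[{"role": "user", "content": "Why can API and app behavior differ?"}],
)
print(response.content[0].text)
```

The OpenAI API works the same way, with the caller passing the system prompt as a message with role "system" (or "developer").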

Yeah, pretty confused by this statement too. Though also I'm pretty sure that if you construct the right fake scenario [0], you can get the regular Claude/ChatGPT interfaces to write something like this.

[0] e.g. fiction writing, fighting for a moral cause, counterexamples, etc.