It's not common, but I've personally built APIs where requests for dangerous modifications like this perform a dry run, returning in the response the resources that would be deleted/changed along with a random token, which then needs to be provided to actually make the change. The idea was that this would be presented in the UI for the user to confirm, but it should be at least as useful for AI agents. You also get the benefit that the token only approves that particular modification operation, so if the resources change in between, you need to reapprove.
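A minimal sketch of the pattern, with hypothetical names (`dry_run_delete`, `confirm_delete`, an in-memory store standing in for a real backend): the dry run reports what would be affected and issues a token bound to a fingerprint of the current state, so the confirmation fails if the resources changed in between.

```python
import hashlib
import secrets

# Hypothetical in-memory store; a real API would persist this server-side.
resources = {"db-1": "prod", "db-2": "staging"}
pending = {}  # token -> (resource ids, fingerprint of the approved state)

def fingerprint(ids):
    # Hash the current state of the affected resources, so the token
    # only approves this exact modification.
    state = repr(sorted((i, resources[i]) for i in ids))
    return hashlib.sha256(state.encode()).hexdigest()

def dry_run_delete(ids):
    # First request: report what would be deleted and issue a one-time token.
    token = secrets.token_urlsafe(16)
    pending[token] = (tuple(ids), fingerprint(ids))
    return {"would_delete": list(ids), "confirm_token": token}

def confirm_delete(token):
    # Second request: only proceed if the resources are unchanged
    # since the dry run; otherwise a fresh approval is required.
    ids, fp = pending.pop(token, (None, None))
    if ids is None or fingerprint(ids) != fp:
        return {"error": "token invalid or resources changed; re-approve"}
    for i in ids:
        del resources[i]
    return {"deleted": list(ids)}
```

Because the token is popped on use and tied to the fingerprinted state, it is single-use and self-invalidating, which is exactly the property that makes it safer to hand to an agent.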

I guess we don’t know what the agent would do after seeing these warnings and a request for extra action.

Perhaps it would stop and rethink; perhaps it would simply notice that an extra action is needed and perform it automatically.

I suppose the decision would depend on multiple factors too (model, prompt, constraints).

"Measure twice, cut once" seems to be forgotten these days.

As well as: "A computer can never be held accountable."