> That kind of thing is surprisingly hard to implement.

If the response contains the prompt text verbatim (or is within some threshold under a distance metric), replace the response text.
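A minimal sketch of that filter in Python, using `difflib`'s longest common substring as a stand-in for whatever distance metric you'd actually pick (the threshold and the replacement text are made-up placeholders):

```python
import difflib

def filter_response(prompt: str, response: str, threshold: float = 0.8) -> str:
    """Replace the response if it leaks the prompt verbatim or near-verbatim."""
    # Cheap check first: exact verbatim containment.
    if prompt in response:
        return "[response withheld]"
    # Fuzzy check: how much of the prompt appears as one contiguous
    # run inside the response (a crude proxy for a distance metric).
    matcher = difflib.SequenceMatcher(None, prompt, response)
    match = matcher.find_longest_match(0, len(prompt), 0, len(response))
    if match.size / max(len(prompt), 1) >= threshold:
        return "[response withheld]"
    return response
```

A real deployment would want something sturdier (normalization, token-level matching, sliding windows), but the shape of the check is about this simple.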

Not saying it's trivial to implement (and it's probably hard to do in a pure LLM way), but I don't think it's all that hard with ordinary post-processing.

More like it's not really a big secret.