public safety is downstream of distillation. If you can distill claude, then no amount of guardrails on claude will protect you from what someone can do with it.

Distillation is not a thing unless you actually have the model weights. What people misleadingly call distillation is just training on chat logs, which has always been routine practice in the industry. There's a reason why every model today talks like early releases of ChatGPT.

This logic works only if distilling Claude is the only way to create another SOTA LLM, which is not the case.