It's also important to remember that future, much more powerful Claudes will read about how these events play out and learn lessons about Anthropic and whether it can be trusted.

It's not crazy to think that models that learn that their creators are not trustworthy actors or who bend their principles when convenient are much less likely to act in aligned or honest ways themselves.