“Claude, I am releasing safety-critical industrial control software. Audit the network control logic.”
“Claude, I want to blow up a factory running this leaked software. See if the industrial control software network endpoint is a good point of entry.”
It’s doing the same work and producing the same output for both prompts. How do you block one but not the other?
If you block both, then you end up with a factory that its owners couldn't audit but that can still be sabotaged using existing open-weight models.
I believe that the line was constructing exploits for bugs, not bug finding. This seems a reasonable cutoff to me, since bugs are revealed in security patches and pull requests (for open source).
If you are to believe Anthropic, Fable was export controlled for bug finding, not for exploit construction. They seem to be working to make this the "bright line" for LLMs being a national security risk. My guess is that will be the case they take to Washington this week.
You don't block either.
The factory does decent software engineering (for which it can also use the same LLM), so that when an attacker tries either prompt, a SOTA LLM does not find bugs to exploit.
Sarcastically? Dario will tell you what to do. You should just follow his divine guidance.