Exactly, this is so flawed. Anthropic themselves said they only reported <1% of the vulnerabilities found, because the rest are still unpatched.

Give open models a Linux environment from before Feb 15 (so none of the Mythos-discovered vulns are patched) and see how many vulnerabilities they can find. Then put them in a sandbox and see if they can escape and send you an e-mail.