About five minutes into this video: https://www.youtube.com/watch?v=1sd26pWhfmg
They also say publicly in their Opus 4.6 post (https://red.anthropic.com/2026/zero-days/):
>In this work, we put Claude inside a “virtual machine” (literally, a simulated computer) with access to the latest versions of open source projects. We gave it standard utilities (e.g., the standard coreutils or Python) and vulnerability analysis tools (e.g., debuggers or fuzzers), but we didn’t provide any special instructions on how to use these tools, nor did we provide a custom harness that would have given it specialized knowledge about how to better find vulnerabilities. This means we were directly testing Claude’s “out-of-the-box” capabilities, relying solely on the fact that modern large language models are generally-capable agents that can already reason about how to best make use of the tools available.
Again, marketing materials by Anthropic. You realize this is by Anthropic themselves, right? And again, not reproducible by outsiders. So useless.
You've moved goalposts from "they haven't open-sourced the process" to "these are marketing materials by Anthropic".
I think you're right to be skeptical, but they _have_ talked about the process publicly.
And I don't think there's anything in there that isn't reproducible by outsiders? They have access to the same Opus 4.6 that you and I do, though not having to pay for the tokens certainly helps.
I'm pretty sure if you wanted to burn a couple thousand bucks, you'd reproduce at least some of these findings.
The goalpost is the same: reproducibility. Talking about a process isn't reproducible. This entire discussion is why I feel developers are so gullible. You are defending a process that's entirely opaque and that you can't even use. It's crazy.