I've had mixed results. A few observations from using agents for security work:

1. Producing new tests to increase coverage. Migrating you to property testing. Setting up fuzzing. Setting up more static analysis tooling. All of that would normally take "time" but now it's a background task.

2. They can find some vulnerabilities. They are "okay" at this, but if you are willing to burn tokens then it's fine.

3. They are sometimes absolutely wrong about something being safe. I have had Claude very explicitly state that a security boundary existed when it didn't. That is, it appeared to exist in the same way that a chroot appears to confine, and it was intended as a security boundary, but it was not a sufficient boundary whatsoever. Multiple models not only identified the boundary and stated that it existed but referred to it as "extremely safe" and the like. This has happened to me a number of times, and it took a lot of nudging before they saw the problems.

4. They often seem to do better with "local" bugs: something with the very obvious pattern of an unsafe thing, along the lines of "that's a pointer deref", "that's an array access", or "that's `unsafe {}`". They do far, far worse the less "local" a vulnerability is. Product features that interact in unsafe ways when combined are something I have yet to see an AI pick up on. This is unsurprising: if we trivialize agents as "pattern matchers", then spotting a known unsafe pattern and checking that pattern's known properties is not so surprising, but "your product has multiple completely unrelated features, bugs, and deployment properties, which all combine into a vulnerability" is not something they'll notice easily.
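The "migrating you to property testing" in item 1 can be a small mechanical change, which is why it delegates well. A minimal sketch with hand-rolled random inputs (an agent would more likely wire up a library such as Hypothesis; the property and names here are illustrative):

```python
import random

# Property: sorting is idempotent. Sorting an already-sorted list
# changes nothing. A property test asserts this over many random
# inputs instead of a few hand-picked example cases.
def prop_sort_idempotent(xs):
    return sorted(sorted(xs)) == sorted(xs)

# Naive property check: generate many random lists and assert the property.
for _ in range(1000):
    xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
    assert prop_sort_idempotent(xs)
```

The point is less the specific property than that this kind of scaffolding is tedious to write and easy to review, so it fits the "background task" mode well.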
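To make the "local" vs. non-local distinction in item 4 concrete, here is a sketch (all names and features are hypothetical, not from any real codebase): the first bug is visible in a single expression, exactly the kind of pattern match agents handle well; the second only exists once two individually reasonable features are combined.

```python
# "Local" bug: the unsafe pattern sits in one expression.
def last_item(xs):
    return xs[len(xs)]  # off-by-one: IndexError on every non-empty list

def last_item_fixed(xs):
    return xs[-1]  # the obvious fix a pattern-matcher can suggest

# "Non-local" bug: each feature looks fine in isolation.
RECORDS = {"r1": "secret payroll data"}
SHARED = {}

def export_for_debugging(record_id):
    # Feature A: admins can export raw records to debug issues.
    return {"id": record_id, "raw": RECORDS[record_id]}

def make_share_link(export):
    # Feature B: exports get shareable links that skip re-authentication.
    token = f"t{len(SHARED)}"
    SHARED[token] = export
    return f"/share/{token}"

def view_share(token):
    # No auth check: holding the link is the only credential.
    # Combined with Feature A, any link recipient sees raw record
    # contents -- a vulnerability neither function contains alone.
    return SHARED[token]
```

Flagging `xs[len(xs)]` needs only the surrounding line; seeing the Feature A + Feature B interaction needs a model of how the product is actually used and deployed.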

It's important to remain skeptical of safety claims by models. Finding vulns is huge, but you need to be able to spot the mistakes.

[work at Mozilla]

I agree that LLMs are sometimes wrong, which is why this new method is so valuable: it provides us with easily verifiable testcases rather than some kind of analysis that could be right or wrong. Triaging static vulnerability reports (i.e. ones with no actual PoC) is very time-consuming and false-positive prone (the same issue plagues pure static analysis).

I can't really confirm the part about "local" bugs anymore, though that might also be a model thing. When I ran experiments a while back, this was certainly true, especially for "one-shot" approaches where you basically prompt once with the source code and want some analysis back. But this actually changed with agentic SDKs, where more context can be pulled together automatically.

Please, implement "name window" natively in Firefox.

I have to use Chrome because of the lack of it.

I've seen fairly poor results from people asking AI agents to fill in coverage holes. Too many tests that either don't make sense or add coverage without meaningfully testing anything.

If you're already at a very high coverage, the remaining bits are presumably just inherently difficult.

Security has had pattern matching in traditional static analysis for a while. It wasn't great.

I've personally used two AI-first static analysis security tools and found great results, including interesting business logic issues, across my employer's SaaS tech stack. We integrated one of the tools. I look forward to getting employer approval to say which, but that hasn't happened yet, sadly.

This description is also pretty accurate for a lot of real-world SWEs, too. Local bugs are just easier to spot. Imperfect security boundaries often seem sufficient at first glance.

But you're not a member of Anthropic's Red Team, with access to a specialist version of Claude.
