I worked in AppSec in the past, made sense to me. Maybe you aren't the target audience?

You don't really need manual verification for these, the CVEs (vulnerabilities) are public and can be programmatically validated.

Manual verification that the "judge" judges correctly.

Also, how exactly do you programmatically validate CVEs?

Most open-source CVEs will have a patch linked in their disclosure. You can get vulnerable code via the git diff, then just verify if it is part of the LLM's finding.