Hacker News

> [...] beating Claude Code (32%) at roughly $0.17 per vulnerability found

Claude Code is an agent harness, not an LLM.

Claude is a brand (or group of LLMs), not an LLM.

Yes, and the article author is fully aware of that. Thank you for pointing out this small mistake though.

It looks like the author is specifically avoiding the model's name, because the results are really weird.

  Opus 4.8/4.7 scored 28%

  Opus 4.6 scored 37%

So the author thought: let's not get into that, just write Claude.

happycube 21 hours ago

Not weird at all, given the variance in Opus' quality over the last few months.

Wild guess: I wouldn't be surprised if Opus 4.6 was run quantized for a while, and 4.7/4.8 have QAT for that nerfed size.

andriy_koval a day ago

many people think opus 4.6 was the best

insiderphd 10 hours ago

Hello! Author here (Katie). Ty for your comments. 4.6 and 4.7 both scored 28% on our benchmark; I just wanted to have 10 things in the list because I wanted a round number.

raincole 18 hours ago

Where is the weird part?

croemer 20 hours ago

The dollar amount is meaningless without comparison - and no other model has a price tag. Sloppy article.

tills13 a day ago

It costs nothing to not be pedantic.

alienbaby 21 hours ago

Possibly, nothing other than accuracy

mdp2021 14 hours ago

"Kindly reach us in Cambridge for the lessons".

Onavo a day ago

Claude Code is the only way to get access to the actual amortized cost of running a Claude-scale model. The consumer non-enterprise API is extremely expensive (with increasing marginal costs for the user and fat profit margins for Anthropic). If you want to approximate a state-level attacker's cost, where they can run the model on their own hardware, Claude Code is probably the best guess at the amortized cost.