I think there might be some confusion here. Isn't the 100% the true positive rate (correct detections), not the false positive rate?

Nope, 9 of 9 legit sites were incorrectly flagged:

> The tradeoff is that it flagged all 9 of the legitimate sites in our dataset as suspicious
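
To spell out the arithmetic: the false positive rate is the share of legitimate items that get flagged, FP / (FP + TN). A quick sketch, where the function name is my own and the only numbers used are the 9 legitimate sites from the quote:

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN): the share of legitimate items wrongly flagged."""
    return false_positives / (false_positives + true_negatives)


# From the quoted sentence: all 9 legitimate sites were flagged,
# so FP = 9 and TN = 0 (no legitimate site passed as clean).
fpr = false_positive_rate(false_positives=9, true_negatives=0)
print(f"{fpr:.0%}")  # 100%
```

The true positive rate would instead be computed over the malicious sites (TP / (TP + FN)), which the quoted sentence says nothing about.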