So, the false negative rate was 84%, but what was the false positive rate?
They have a table "AUTOMATIC SCAN RESULTS (263 URLS)" that sort of presents this information. Of the 9 sites that were negatives, they say they incorrectly flagged 6 as phishing.
With a false positive rate of about 67% (6 of 9), it's not surprising they were able to drive down their false negative rate. Also, a test set of 254 phishing sites and only 9 legitimate ones is a strange choice.
(Or maybe they need to work on how they present data in tables; I didn't read the supporting text.)
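For anyone checking the math: the false positive figure above is just the 6 wrongly flagged sites divided by the 9 legitimate ones. A minimal sketch of that arithmetic, assuming the counts quoted from the table are right; the function name `false_positive_rate` is made up for illustration.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN): the share of legitimate sites wrongly flagged."""
    negatives = false_positives + true_negatives
    if negatives == 0:
        raise ValueError("need at least one legitimate site")
    return false_positives / negatives


# Counts as quoted in this thread from "AUTOMATIC SCAN RESULTS (263 URLS)":
# 9 legitimate sites, 6 of them flagged as phishing, so 3 passed correctly.
fpr = false_positive_rate(false_positives=6, true_negatives=3)
print(f"{fpr:.1%}")  # 66.7%
```

With only 9 negatives, each misclassified site moves the rate by about 11 points, so the estimate is very coarse.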
The false positive rate was about 67% (6 of 9) for "automatic scan" and 100% (!) for "deep scan".
In other words, you can get these numbers if your deep scan filter is isSuspicious() { return true; }.
I think there might be some confusion here. The 100% looks like the true positive rate (correct detections), not the false positive rate?
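Whichever rate the 100% actually refers to, a filter that flags everything hits 100% on both, which is why the two need to be reported separately. A rough sketch, assuming the 254 phishing / 9 legitimate split mentioned upthread; `is_suspicious`, `tpr_and_fpr`, and the URLs are placeholders, not anything from the product.

```python
def is_suspicious(url: str) -> bool:
    """Hypothetical stand-in for a scanner that flags every URL."""
    return True


def tpr_and_fpr(labels: list[bool], predictions: list[bool]) -> tuple[float, float]:
    """Return (true positive rate, false positive rate); True means phishing."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Test-set shape quoted in the thread: 254 phishing URLs, 9 legitimate ones.
# The URLs themselves are placeholders.
urls = [f"https://site-{i}.example/" for i in range(263)]
labels = [True] * 254 + [False] * 9

predictions = [is_suspicious(u) for u in urls]
tpr, fpr = tpr_and_fpr(labels, predictions)
print(f"TPR={tpr:.0%} FPR={fpr:.0%}")  # TPR=100% FPR=100%
```

On a test set this lopsided, the always-true filter would also be right on 254 of 263 URLs, so overall accuracy alone would look great too.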
Brb, applying for YC funding for my new AI-based phishing detection system.
(‘return true’ is just a very optimized neural network after all!)