I lived through this. The author is missing some important context.
Spray-and-Pray Algorithms
After AlexNet, dozens of companies rushed into medical imaging. They grabbed whatever data they could find, trained a model, then pushed it through the FDA’s broken clearance process. Most of these products failed in practice because they were junk. In mammography, only 2–3 companies actually built clinically useful products.
Products actually have to be useful.
There were two product categories in the space: CAD and triage. CAD is basically an overlay on the screen as you read the case. Rads hated it because it was distracting, and because the feature-engineering-based CAD of the 80s-90s had been demonstrated to be a failure. Users basically ignored "CADs."
Triage is when you prioritize cases (cancers to the top of the stack). This has little to no value: when you have a stack of 50 cases you have to read today, why do you care about the order? There were some niche use cases, but it was largely pointless. It could actually be detrimental. The algorithm would put the easy cancer cases at the top, so the user would spend less time on the rest of the stack (where the harder cases ended up).
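To make the mechanism (and its failure mode) concrete, here's a minimal sketch of what a triage product does. The `Study` type and `model_score` field are hypothetical stand-ins, not any vendor's actual interface:

```python
from dataclasses import dataclass

@dataclass
class Study:
    accession_id: str
    model_score: float  # hypothetical AI suspicion score: 0.0 normal .. 1.0 cancer

def triage(worklist: list[Study]) -> list[Study]:
    # The entire product: sort today's stack so the most suspicious cases read first.
    return sorted(worklist, key=lambda s: s.model_score, reverse=True)

stack = [
    Study("A1", 0.12),
    Study("A2", 0.97),  # obvious cancer -> the model is most confident here
    Study("A3", 0.55),  # subtle case -> lands mid-stack, read with less care
]
print([s.accession_id for s in triage(stack)])  # ['A2', 'A3', 'A1']
```

All the value has to come from reordering alone, and the reordering itself trains the reader to relax once the top of the stack is cleared.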
*Side note:* did you know that using CAD was a billable extra to insurance? Even though it was proven not to work, it remained reimbursable for years, up until just a few years ago.
Poor Validation Standards
Models collapsed in the real world because the FDA process is designed for drugs and hardware, not adaptive software. Validation typically means ~300 "golden" cases, labeled by 3 radiologists with majority-vote arbitration. If 3 rads say it's cancer, it's cancer. If they disagree, it's "not a good case for the study." This filtering throws out exactly the hard cases (where readers disagree) that models need to handle in the real world. Instead of 500K noisy real-world studies, you validate on a sanitized dataset, and companies learned how to "cheat" by overfitting to these toy datasets. You can explain this to regulators endlessly, but the bureaucracy only accepts the previously blessed process. Note: that previous process was defined by CAD, a product cleared in the 80s and shown to fail miserably in clinical use. The validation standard behind a grand historical regulatory failure is the standard you MUST use today for any device in mammography that looks like a CAD.
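To spell out why that filtering is so destructive, here's a sketch of the golden-set construction as I'm describing it. The field names are illustrative and this is an assumed reconstruction of the process, not anyone's actual pipeline:

```python
from collections import Counter

def build_golden_set(cases):
    """Assumed reconstruction of the 'golden' dataset workflow: keep only
    cases where all readers agree, drop the rest."""
    golden = []
    for case in cases:
        tally = Counter(case["reader_labels"])   # one call per radiologist
        label, votes = tally.most_common(1)[0]
        if votes == len(case["reader_labels"]):
            # Unanimous reads become ground truth.
            golden.append({**case, "truth": label})
        # Split reads are dropped entirely -- which is exactly how the hard,
        # ambiguous cases disappear from the ~300-case validation set.
    return golden

cases = [
    {"id": 1, "reader_labels": ["cancer", "cancer", "cancer"]},  # kept
    {"id": 2, "reader_labels": ["cancer", "normal", "normal"]},  # dropped
]
print([c["id"] for c in build_golden_set(cases)])  # [1]
```

A model that only ever sees case 1 in validation never gets measured on case 2, which is the kind of case that matters in clinic.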
Politics Over Outcomes
We ran the largest multi-site prospective trial in the space (15 sites). Results: ~50% reduction in radiologist workload, an increased cancer detection rate, and 10x lower cost per study. We even caught cancers missed in the standard workflow. Clinics still resisted adoption, because admitting to missed cancers looked bad for their reputation. Bureaucratic EU healthcare systems preferred to avoid the embarrassment even though the data was entirely internal.
I'll leave you with one particularly salient story. I was speaking to the head of a large US hospital IT/Ops organization. We had a 30-minute conversation about how to keep our software's decisions out of the EMR/PACS so that they could avoid litigation risk. Not once did we talk about patient impact. Not once...
Despite all that, our system caught cancers that would have been missed. Last I checked, at least 104 women had their cancers detected by our software and are still walking around. That's the real win, even if politics buried the broader impact.