The best story I heard about machine learning and radiology was when folks were racing to try to detect COVID in lung X-rays.

As I recall, one group had fairly good success, but eventually someone figured out that their data set had images from a low-COVID hospital and a high-COVID hospital, and the lettering on the images used different fonts. The ML model was detecting the font, not the COVID.

[a bit of googling later...]

Here's a link to what I think was the debunking study: https://www.nature.com/articles/s42256-021-00338-7

If you're not at a university, try searching for "AI for radiographic COVID-19 detection selects shortcuts over signal" and you'll probably be able to find an open-access copy.

I remember a claim that someone was trying to use an ML model to detect COVID by analyzing the sound of the patient coughing.

I couldn't for the life of me understand how this was supposed to work. If the coughing of COVID patients (as opposed to patients with other respiratory illnesses) actually sounded different in a statistically meaningful way (and why did they suppose that it would? Phlegm is phlegm, surely), a human listener should have been able to pick up on it easily.

I don't see why that's a bad idea. If you can also use dogs to detect COVID[1], surely you can build a machine with some sensor that can do the same.

[1] https://academic.oup.com/pmj/article/98/1157/212/6958858?log...

That doesn't really follow. NN models have been able to pick up on noisier and more subtle patterns than humans for a long time, so this type of research is definitely worth a shot in my opinion. The pattern might also not be noticeable to a human at all, e.g. "this linear combination of frequency values in Fourier space exceeds a specific threshold".
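For a concrete sense of what "a linear combination of frequency values exceeds a threshold" would even look like as a detector, here's a toy sketch in Python. The band count, weights, and threshold are placeholders standing in for whatever a training procedure would actually produce; this is not a real diagnostic.

```python
# Toy sketch: a "linear combination of frequency-band energies exceeds a
# threshold" detector. Weights and threshold are made up for illustration.
import numpy as np

def spectral_feature(audio: np.ndarray) -> np.ndarray:
    """Average magnitude spectrum of the clip, binned into 32 coarse bands."""
    spectrum = np.abs(np.fft.rfft(audio))
    bands = np.array_split(spectrum, 32)
    return np.array([band.mean() for band in bands])

def toy_detector(audio: np.ndarray, weights: np.ndarray, threshold: float) -> bool:
    """Fires when a learned linear combination of band energies exceeds a threshold."""
    return float(weights @ spectral_feature(audio)) > threshold

# Usage with synthetic audio and made-up parameters:
rng = np.random.default_rng(0)
clip = rng.normal(size=16000)          # 1 second of noise at 16 kHz
weights = rng.normal(size=32)          # stand-in for learned weights
print(toy_detector(clip, weights, threshold=0.5))
```

A pattern like that is trivially computable by a machine but essentially inaudible to a human listener, which is the whole point.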

Anecdotes like this are informative as far as they go, but they don't say anything at all about the technique itself. Like your story about the fonts used for labeling, essentially all of the drawbacks cited by the article come down to inadequate or inappropriate training methods and data. Fix that, which will not be hard from a purely-technical standpoint, and you will indeed be able to replace radiologists.

Sorry, but in the absence of general limiting principles that rule out such a scenario, that's how it's going to shake out. Visual models are too good at exactly this type of work.

The issue is that in medicine, much like with automobiles, unexpected failure modes can be catastrophic for individual people. "Fixing" failure modes like the one in the comment above is not difficult from a technical standpoint, that's true, but you can only fix it once you've identified it, and by that point you may have a dead person or people. That's why AI in medicine and self-driving cars are so unlike AI for programming or writing, and why they move comparatively at a snail's pace.

Yet self-driving cars are already competitive with human drivers, safety-wise, given responsible engineering and deployment practices.

Like medicine, self-driving is more of a seemingly-unsolvable political problem than a seemingly-unsolvable technical one. It's not entirely clear how we'll get there from here, but it will be solved. Would you put money on humans still driving themselves around 25-50 years from now? I wouldn't.

These stories about AI failures are similar to calling for banning radiation therapy machines because of the Therac-25. We can point and laugh at things like the labeling screwup that pjdesno mentioned -- and we should! -- but such cases are not a sound basis for policymaking.

> Yet self-driving cars are already competitive with human drivers, safety-wise, given responsible engineering and deployment practices.

Are they? Self-driving cars only operate in a much safer subset of the conditions that humans do. They have remote operators who will take over if a situation arises outside of the normal operating parameters, or they will just pull over and stop.

Tesla told everybody 10 years ago that self-driving cars were a reality.

Waymo claims to have them. Some Hacker News commenters do too; I've started to believe those are Waymo employees or stock owners.

Apart from that, I don't know anybody who has ever used or even seen a self-driving car.

Self-driving cars are not a thing, so you can't say they are more reliable than humans.

I've never been in a self-driving car myself, but your position verges on moon-landing denial. They most certainly do exist, and have for a while.

Yes, they still need human backup on occasion, usually to deal with illegal situations caused by other humans. That's definitely the hard part, since it can't be handwaved away as a "simple" technical problem.

AI in radiology faces no such challenges, other than legal and ethical access to training data and clinical trials. Which admittedly can't be handwaved away either.

If it hadn't been the font, it might have been anomalies in how the images were taken, or even in the encoder software. You can never really be sure what exactly the ML model is detecting.

Exactly. A marginally higher image ISO at one location vs a lower ISO at another could potentially have a similar effect, and it would be quite difficult to detect.
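Here's a rough sketch of the kind of audit that might catch this (the helper names and the crude statistics are mine, not from any of the studies): if a trivial per-image statistic already separates the two acquisition sites, then any correlation between site and label is available to the model as a shortcut.

```python
# Sanity check (sketch): can crude acquisition statistics alone predict which
# hospital an X-ray came from? images_a and images_b are assumed to be lists
# of 2D numpy arrays, one per X-ray, from the two sites.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def crude_stats(image: np.ndarray) -> list:
    # Global brightness and a rough noise estimate (std of horizontal pixel differences).
    return [image.mean(), np.diff(image, axis=1).std()]

def site_leakage_score(images_a, images_b) -> float:
    X = np.array([crude_stats(im) for im in images_a + images_b])
    y = np.array([0] * len(images_a) + [1] * len(images_b))
    # Accuracy well above chance means acquisition artifacts alone identify
    # the site, and the site/label correlation becomes a shortcut.
    return cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
```

It doesn't prove the model is using the shortcut, but it tells you the shortcut is there to be used.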

You can give it the same tests the human radiologists take in school.

They do take tests, don't they?

They don't all score 100% every time, do they?

The point here is that a radiologist has a sense of which light patterns it is sensible to draw conclusions from and which it isn't, because the radiologist has a concept of real-world 3D objects.

Sure. It's just not a valid point. Even if it's valid today, it won't be by next week.

Why not? That's what Grad-CAM is for, right?
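For what it's worth, here's a minimal Grad-CAM sketch (assuming an untrained torchvision ResNet as a stand-in for a real chest X-ray model; details vary by architecture). It shows where in the image the model's evidence comes from, so you can at least see whether the heat sits on the lung fields or on the corner with the lettering.

```python
# Minimal Grad-CAM: weight the last conv block's activations by the
# global-average-pooled gradients of the class score, then upsample.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)   # stand-in for a trained model
model.eval()

activations, gradients = {}, {}

def save_activation(module, inp, out):
    activations["a"] = out

def save_gradient(module, grad_in, grad_out):
    gradients["g"] = grad_out[0]

model.layer4.register_forward_hook(save_activation)
model.layer4.register_full_backward_hook(save_gradient)

def grad_cam(image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """Return an H x W heatmap in [0, 1] for one image of shape (1, 3, H, W)."""
    score = model(image)[0, class_idx]
    model.zero_grad()
    score.backward()
    acts, grads = activations["a"], gradients["g"]        # both (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)        # pooled gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    return (cam / (cam.max() + 1e-8)).squeeze().detach()

heatmap = grad_cam(torch.randn(1, 3, 224, 224), class_idx=0)
print(heatmap.shape)                                      # torch.Size([224, 224])
```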

What if the ML model draws its conclusion from exactly the right pixels, but the cause is a rasterization issue?