> Most tools can only diagnose abnormalities that are common in training data, and models often don’t work as well outside of their test conditions.

I don’t understand this. All data in existence is fodder for training. Barring the privacy issue, which the article treats as orthogonal, training data and actual data are the same set of things.

Test conditions should be identical to real conditions. What in the world is the article saying? What actual differences are there?

The only issue I can think of is if there is LLM-like language logic involved in reading the results of a scan. For example, the radiologist looks at 30 sections of the image, the sections all have different relationships with each other, and those relationships end up influencing the outcome. But I doubt it’s like that; radiology should be much simpler than learning a foreign language.