Because that is literally happening. I did a bit of work developing some radiological models, and the ratio of healthy to malignant samples is usually about 4 to 1. On top of that, you modify the error function to weight malignant cases more heavily, because you are quite often working with datasets as small as 500 images, so after an 80/20 training/validation split you are left with about 80 malignant examples to train on. The result is that as soon as you take a realistic sample, where one specific condition appears in maybe 1 in 100 or 1 in 1000 cases, the false positives make your model practically useless.
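To put rough numbers on the false positive problem, here is a minimal sketch of the base-rate arithmetic. The 90% sensitivity and 90% specificity are made-up illustrative values, not figures from any model I worked on, and `positive_predictive_value` is just a hypothetical helper name. It applies Bayes' rule to get the share of "malignant" calls that are actually malignant at each prevalence.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive predictions that are truly positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed operating point, chosen only for illustration.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# 0.2 matches the 4-to-1 training ratio; 0.01 and 0.001 are the realistic rates.
for prevalence in (0.2, 0.01, 0.001):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence}: PPV = {ppv:.3f}")
```

Under those assumptions, about 69% of positive calls are correct at the 4-to-1 ratio, about 8% at 1 in 100, and under 1% at 1 in 1000, even though the model itself has not changed.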

Of course SOTA models are much better, but medical data is quite difficult and expensive to get, so there are not a lot of them.