> There are so many use cases where a local, purpose-built model that's dependably good at one thing would really make a difference. But no one is going to throw a billion dollars to give us amazing dust removal, flawless scene segmentation, etc.

iPhones have models for text extraction and in-painting in the Photos app.

Neither has knobs to tune it, but I think they are decent for their intended audience (definitely not flawless, but I don’t think flawless exists anywhere, even if you drop the ‘local’ requirement).

For scene segmentation, iOS has models for segmenting people out of an image (https://developer.apple.com/documentation/Vision/segmenting-...).

It also has models for detecting faces, face features, body and hand poses, or for picking the ‘best’ selfie from a set.

(And dust removal is fairly niche compared to these, I think. Or am I overlooking some common use case for it that many people want?)