Immich can do part of this. For photos it does ML object detection and OCR for text. I think for video it currently only looks at the first frame. It also has face / people detection.

And once set up, it's easy to use even for non-technical people.