The actual screenshot isn’t sent; a hash is generated from the screenshot and compared against a library of known screenshots of ads/shows/etc. for similarity.
Not super tough to pull off. I was experimenting with FAISS a while back and indexed screenshots of the entire Seinfeld series. I was able to take an input screenshot (or Seinfeld meme, etc.) and pinpoint the specific episode and approximate timestamp it was from.
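The core idea is simple: embed each frame, index the embeddings, then run nearest-neighbor search on the query frame. Here's a toy sketch of that lookup with made-up three-dimensional "embeddings" and brute-force cosine similarity standing in for a real FAISS index (in practice the vectors would come from an image model run over frames of every episode):

```python
import math

# Toy index: (episode, timestamp-in-seconds) -> frame embedding.
# These vectors are invented for illustration; a real pipeline would
# store thousands of high-dimensional embeddings per episode in FAISS.
index = {
    ("S05E14", 372):  [0.90, 0.10, 0.00],
    ("S07E06", 1015): [0.10, 0.80, 0.30],
    ("S04E11", 88):   [0.00, 0.20, 0.95],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def lookup(query):
    # Brute-force exact search; FAISS does the same thing approximately,
    # at scale, which is why it works over a whole series of screenshots.
    return max(index, key=lambda k: cosine(index[k], query))

print(lookup([0.85, 0.15, 0.05]))  # -> ('S05E14', 372)
```

A meme screenshot that's cropped or has text overlaid still lands near the original frame's embedding, which is why the episode/timestamp lookup tolerates that kind of noise.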
> The actual screenshot isn’t sent, some hash is generated from the screenshot and compared against a library of known screenshots of ads/shows/etc for similarity.
This is most likely the case, although there's nothing stopping them from uploading the original 4K screengrab in cases where there's no match in their database. That would let them manually ID the content and add a hash, or just scrape it for whatever info they can add to your dossier.
I thought that similar inputs were not supposed to give similar hashes, but apparently that property is specific to cryptographic hashing. Locality-sensitive hashing methods (e.g. perceptual hashing [1]) make similar inputs produce similar hashes.
[1] https://en.wikipedia.org/wiki/Perceptual_hashing
Ah, bingo, yes!
I should have been more specific in my comment. Perceptual hashing produces similar hashes, and therefore higher similarity scores, for similar-looking images.
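The simplest variant is the "average hash" (aHash): shrink the image to a tiny grayscale grid, then set one bit per pixel depending on whether it's brighter than the mean. Small visual changes flip few bits, so Hamming distance between hashes tracks visual similarity. A minimal sketch, assuming the resize/grayscale step has already happened and the input is an 8x8 grid of 0–255 values:

```python
def average_hash(pixels):
    # pixels: 8x8 grid of grayscale values (real code resizes the image first).
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    # One bit per pixel: 1 if brighter than the mean.
    return sum(1 << i for i, p in enumerate(flat) if p > mean)

def hamming(h1, h2):
    # Number of differing bits; small distance = visually similar.
    return bin(h1 ^ h2).count("1")

# A synthetic gradient image, a slightly perturbed copy, and its inverse.
img      = [[10 * (r + c) for c in range(8)] for r in range(8)]
noisy    = [[p + (3 if (r + c) % 2 else -3) for c, p in enumerate(row)]
            for r, row in enumerate(img)]
inverted = [[255 - p for p in row] for row in img]

print(hamming(average_hash(img), average_hash(noisy)))     # small distance
print(hamming(average_hash(img), average_hash(inverted)))  # large distance
```

Other variants (dHash, pHash with a DCT) are more robust to scaling and compression, but the bits-plus-Hamming-distance structure is the same.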
Lots of cool techniques to experiment with. Highly recommend playing around if you’re interested.
I immediately did a little exploration for potential utility in neuroimaging analyses...not that anything was immediately obvious to me, but I love learning about things like this.