"we used a fuzzer to minimize a corpus of 10 million PDF files down to 4,200 without any loss of code coverage"

Did they need a fuzzer for that? Couldn't they have rendered them all and seen what's exercised?

It's a set cover problem, which is NP-hard. Something like 4,200 files probably runs nicely on your laptop in a reasonable amount of time.
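To make the set cover framing concrete, here's a rough sketch of the usual greedy approximation. It is not what the quoted team or any particular fuzzer actually does. It assumes you already have a map from each file to the set of coverage points it hits (the file names and integer IDs below are made up), and it keeps picking whichever file covers the most still-uncovered points. Greedy preserves the full coverage but doesn't guarantee the smallest possible subset.

```python
from typing import Dict, Hashable, List, Set


def greedy_minimize(coverage: Dict[str, Set[Hashable]]) -> List[str]:
    """Return a subset of files whose combined coverage equals that of all files.

    Greedy set cover approximation: the result preserves coverage,
    but it is not guaranteed to be the minimum-size subset.
    """
    # Everything the full corpus covers; this is what must be preserved.
    uncovered: Set[Hashable] = set().union(*coverage.values())
    remaining = dict(coverage)
    chosen: List[str] = []

    while uncovered:
        # Pick the file that adds the most new coverage.
        # Iterating in sorted order makes ties go to the smallest name.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] & uncovered))
        chosen.append(best)
        uncovered -= remaining.pop(best)

    return chosen


# Hypothetical coverage data: file name -> IDs of the branches it exercises.
coverage = {
    "a.pdf": {1, 2, 3},
    "b.pdf": {2, 3},
    "c.pdf": {4},
    "d.pdf": {1, 4},
}
kept = greedy_minimize(coverage)
print(kept)
```

Collecting the per-file coverage is the expensive part at 10 million inputs; once you have it, the greedy pass itself is cheap.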