Hacker News

The raster image problem is real but there's a middle ground between "never invert" and a full NN classifier.

The author already computes BT.601 brightness per page. You can run the same calculation per-image bounding box instead of per-page, then add a bimodal pixel distribution check: if a raster image has most pixels near black or near white with few midtones, it's probably a line diagram or screenshot, not a photograph. That heuristic catches the main false-positive case (black-line diagrams on white backgrounds) with maybe 20 lines of image processing code.

It won't be perfect, and gwern's point stands that a proper trained classifier would be more accurate. But for a PDF viewer where you're already parsing content streams to get image coordinates, it's a lot cheaper than shipping a model and handles 80% of the problematic cases. The remaining edge cases (medical scans, thermal images) are rare enough that the per-page toggle is reasonable fallback.

gwern 3 hours ago [ - ]

FWIW, we did consider a histogram heuristic, and I believe GreaterWrong still uses one rather than InvertOrNot.com. But I regularly saw images on GW where the heuristic got it wrong but ION got it right, so the accuracy gap was meaningful; and that's why we went for ION rather than port over the histogram heuristic.