Great! I was thinking about PP, but because I only ran an order of magnitude fewer articles (under 1M pages, by piggybacking on Dell's OCR), I relied on Arcanum ( https://www.arcanum.com/en/newspaper-segmentation/about/ ), which was cheap enough for me (though I think not cheap enough at your scale).

Cheers!

Hmm, I just tried to upload the JPGs of some of today's samples to Arcanum via https://www.arcanum.com/en/newspaper-segmentation/try-it/ and it didn't work. I'll try again later, but from a cursory look it seems it wouldn't return the info I'd need to correct the output if I didn't like it, and I'd still have to stitch the individual pages back together myself?

Probably much cheaper than my process though...