Yes, this! We're training it on a ton of diverse document images to learn the reading order and layout of documents :)

But you have to render the PDF to get an image, right? How do you go from PDF to raster?