Hacker News

How does it do with multi-column text and headers and footers?

We trained the model on tables with hierarchical column headers and with rowspan and colspan > 1, so it should work fine. This is why we predict the table in HTML instead of Markdown.
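To illustrate why HTML output helps here: Markdown tables have no syntax for merged cells, while HTML carries `rowspan` and `colspan` directly, so a hierarchical header can be expanded back into a plain grid downstream. Below is a minimal sketch using only Python's standard `html.parser`. The names `TableGrid`, `html_table_to_grid`, and the sample table are hypothetical and made up for illustration; this is not the model's actual output or post-processing code, and it assumes a single well-formed table.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Expand one HTML table into a grid, repeating text across spanned cells."""

    def __init__(self):
        super().__init__()
        self.grid = []      # one dict per row: column index -> cell text
        self.row = -1       # index of the row currently being parsed
        self.cell = None    # (row, col, rowspan, colspan) of the open cell
        self.text = []      # text fragments collected for the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
            while len(self.grid) <= self.row:
                self.grid.append({})
        elif tag in ("td", "th") and self.row >= 0:
            a = dict(attrs)
            rowspan = int(a.get("rowspan") or 1)
            colspan = int(a.get("colspan") or 1)
            # Skip columns already filled by a rowspan from an earlier row.
            col = 0
            while col in self.grid[self.row]:
                col += 1
            self.cell = (self.row, col, rowspan, colspan)
            self.text = []

    def handle_data(self, data):
        if self.cell is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            r, c, rowspan, colspan = self.cell
            value = "".join(self.text).strip()
            for i in range(r, r + rowspan):
                while len(self.grid) <= i:
                    self.grid.append({})
                for j in range(c, c + colspan):
                    self.grid[i][j] = value
            self.cell = None


def html_table_to_grid(html):
    """Return the table as a list of equal-length rows (lists of strings)."""
    parser = TableGrid()
    parser.feed(html)
    parser.close()
    width = max((max(row) + 1 for row in parser.grid if row), default=0)
    return [[row.get(c, "") for c in range(width)] for row in parser.grid]


# Hypothetical sample: a two-level header that Markdown cannot express.
SAMPLE = """
<table>
  <tr><th rowspan="2">Model</th><th colspan="2">Score</th></tr>
  <tr><th>Dev</th><th>Test</th></tr>
  <tr><td>A</td><td>1</td><td>2</td></tr>
</table>
"""

for row in html_table_to_grid(SAMPLE):
    print(row)
```

Each spanned cell is copied into every grid position it covers, so the "Score" header sits above both "Dev" and "Test". The sketch places a cell at the first free column only, which is enough for well-formed tables but not for overlapping spans.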

nehalem 16 days ago

Thank you. I was rather thinking of magazine-like layouts, with columns of text and with headers and footers on every page holding the article title and page number.

souvik3333 16 days ago

It should work there also. We have trained on research papers with two columns of text. Generally, papers have references as a footer and contain page numbers.