Reporting back on this: here's some sample output from https://www.sidis.net/animate.pdf:

  THE ANIMATE
  AND THE INANIMATE

  WILLIAM JAMES SIDIS

  <img>A black-and-white illustration of a figure holding a book with the Latin phrase "ARTI et VERITATI" below it.</img>

  BOSTON

  RICHARD G. BADGER, PUBLISHER

  THE GORHAM PRESS

  Digitized by Google

I haven't seen ANY errors in what it has done, which is quite impressive.

Here, it's doing tables of contents (I used a slightly different copy of the PDF than I linked to):

  <table>
    <tr>
      <td>Chapter</td>
      <td>Page</td>
    </tr>
    <tr>
      <td>PREFACE</td>
      <td>3</td>
    </tr>
    <tr>
      <td>I. THE REVERSE UNIVERSE</td>
      <td>9</td>
    </tr>
    <tr>
      <td>II. REVERSIBLE LAWS</td>
      <td>14</td>
    </tr>

Other than being ridiculously slow, it seems quite good at doing what it says it does.
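
In case anyone wants to post-process table output like the sample above, here is a minimal sketch using only Python's standard library. It assumes the tool emits plain `<table>`/`<tr>`/`<td>` markup exactly as in my paste; the `TableRows` class and `parse_table` helper are names I made up for illustration, not part of the tool. It also tolerates a table that is cut off before its closing tag, as my excerpt is.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(markup):
    """Return the table as a list of rows (hypothetical helper)."""
    parser = TableRows()
    parser.feed(markup)
    parser.close()
    return parser.rows


# Sample copied from the excerpt above, including the missing </table>.
sample = """<table>
<tr><td>Chapter</td><td>Page</td></tr>
<tr><td>PREFACE</td><td>3</td></tr>
<tr><td>I. THE REVERSE UNIVERSE</td><td>9</td></tr>
<tr><td>II. REVERSIBLE LAWS</td><td>14</td></tr>
"""

rows = parse_table(sample)
for chapter, page in rows[1:]:  # skip the header row
    print(f"{chapter} ... {page}")
```

The first row is the header, so `rows[1:]` gives the chapter/page pairs. Page numbers stay as strings here; convert them with `int()` if you need to sort or compare.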