Hacker News

Thanks kindly for this well done and brave introduction. There are few people these days who'd even recognize the bare ASCII 'Postscript' form of a PDF at first sight. First step is to unroll into ASCII of course and remove the first wrapper of Flate/ZIP,LZW,RLE. I recently teased Gemini for accepting .PDF and not .EPUB (chapterized html inna zip basically, with almost-guaranteed paragraph streams of UTF-8) and it lamented apologetically that its pdf support was opaque and library oriented. That was very human of it. Aside from a quick recap of the most likely LZW wrapper format, a deep dive into Lineariziation and reordering the objects by 'first use on page X' and writing them out again preceding each page would be a good pain project.

UglyToad is a good name for someone who likes pain. ;-)