Hacker News

No one said it was black magic. In the context of OCR and of parsing PDFs into structured data and/or text, rendering is a completely different task from text extraction.

People have pointed this out many times in the discussion, e.g. https://news.ycombinator.com/item?id=44783004, https://news.ycombinator.com/item?id=44782930, https://news.ycombinator.com/item?id=44789733.