Rendering is a different problem from understanding what's rendered.

If your PDF draws part of a sentence at the beginning of the document, part in the middle, and part at the end, split across multiple sections, it's still rather trivial to render.

To parse and understand that this is the same sentence? A completely different matter.
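
To make that concrete, here is a minimal sketch, not tied to any real PDF library. The `Fragment` type, the coordinates, and the `line_tolerance` value are all made up for illustration. A renderer only has to paint each fragment at its position, in whatever order the content stream lists them. An extractor has to guess the reading order from geometry alone.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical text run: a string placed at a page position."""
    x: float
    y: float  # baseline height; PDF coordinates grow upward
    text: str


def render_order(fragments):
    # A renderer can paint fragments in stream order; position does the rest.
    return [f.text for f in fragments]


def reading_order(fragments, line_tolerance=2.0):
    # Naive heuristic: sort top to bottom, group fragments whose baselines
    # are within line_tolerance into one line, then sort each line by x.
    ordered = sorted(fragments, key=lambda f: -f.y)
    lines = []
    for frag in ordered:
        if lines and abs(lines[-1][0].y - frag.y) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    return " ".join(
        " ".join(f.text for f in sorted(line, key=lambda f: f.x))
        for line in lines
    )


# One sentence, emitted out of order in the (made-up) content stream.
fragments = [
    Fragment(x=200, y=700, text="brown fox"),
    Fragment(x=72, y=686, text="jumps over"),
    Fragment(x=72, y=700, text="The quick"),
    Fragment(x=150, y=686, text="the lazy dog."),
]

print(render_order(fragments))
print(reading_order(fragments))
```

Even this toy heuristic falls over on multi-column pages, rotated text, or a sentence continued on another page, which is where the real extraction work is.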

Computers "don't understand" things. They process things, and what you're describing is called layout, which is a key part of PDF rendering. I do understand that for someone unfamiliar with the internals of file formats, parsing, text shaping, and rendering in general, it all might seem like black magic.

No one said it was black magic. In the context of OCR and parsing PDFs to convert them to structured data and/or text, rendering is a completely different task from text extraction.

As people have pointed out many times in the discussion: https://news.ycombinator.com/item?id=44783004, https://news.ycombinator.com/item?id=44782930, https://news.ycombinator.com/item?id=44789733 etc.

You're wrong. There is nothing inherent in "rendering" that means "raster or pixels". You can render PDFs or any format into any format you want, including XML for example.

In fact, in the majority of PDFs, a large part of rendering has to do with composing text.