Computers "don't understand" things. They process things, and what you're saying is called layoutinng which is a key part of PDF rendering. I do understand for someone unfamiliar with the internals of file formats, parsing, text shapping, and rendering in general, it all might seem like a blackmagic.
No one said it was as black magic. In the context of OCR and parsing PDFs to convert them to structured data and/or text, rendering is a completely different task from text extraction.
As people have pointed out many times in the discussion: https://news.ycombinator.com/item?id=44783004, https://news.ycombinator.com/item?id=44782930, https://news.ycombinator.com/item?id=44789733 etc.
You're wrong. There is nothing inherent in "rendering" that means "raster or pixels". You can render PDFs or any format into any format you want, including XML for example.
In fact, in majority of PDFs, a large part of rendering has to do with composing text.