Well, you clearly haven't parsed a wide variety of PDFs. If you had, you would have run into PDFs that contain only images, or that contain embedded text that is utter nonsense and doesn't match what is shown on the page when rendered.

And that is before we even get into text structure. As everyone knows, text is easier to read when things like paragraphs, columns and tables are preserved in the output. And guess what: if you use only the parsing engine for that, what you get out is a garbled mess.

If your rendering engine doesn't output what is shown, your engine is broken, and it can be broken whether you render into a bitmap or into structured data.