Seems a bit scary that the "source" text from the PDFs could actually be hallucinated.

Given that the input is an image and not the raw PDF, it's not completely unexpected.