Hmm. Yesterday I stuck a >100 page PDF into a Claude Project and asked Claude to reference a table in the middle of it (I gave page numbers) to generate machine readable text. I watched with some bafflement as Claude convinced itself that the PDF wasn’t a PDF, but then it managed to recover all on its own and generated 100% correct output. (Well, 100% correct in terms of reading the PDF - it did get a bit confused a few times following my instructions.)

PDF is a trash format