Hacker News

I will try it out. Is this the correct library? https://github.com/yobix-ai/extractous

I have used Gemini for OCR and it was indeed good. I also used GPT 3.5 and liked that too.

You could also try PageIndex OCR, the first long-context OCR model. Most current OCR tools process each page independently, so they lose the document’s structure and produce Markdown with incorrect heading levels. PageIndex OCR generates Markdown with more accurate heading levels that better capture the document’s structure.

malshe 5 days ago

Ok, thanks for sharing. I will take a look.