This is not always easy. The models I tried were too helpful: they rewrote too much instead of just fixing simple typos. Even with huge prompts, I still found sentences where the LLM had been too enthusiastic. I ended up applying regexes for common typos and accepting some residual errors. It might be better now, though. Since then I’ve moved to all-in-one solutions like Mathpix and Mistral-OCR, which are quite good for my purpose.
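For anyone curious, the regex approach could look something like the minimal sketch below. The patterns, the `OCR_FIXES` table, and the `fix_common_typos` name are all illustrative assumptions, not the rules I actually used; a real list would be built from the errors that show up in your own scans.

```python
import re

# Hypothetical table of common OCR errors and their fixes.
# These patterns are illustrative assumptions only.
OCR_FIXES = [
    (re.compile(r"\btlie\b"), "the"),              # "h" misread as "li"
    (re.compile("\ufb01"), "fi"),                  # "fi" ligature character
    (re.compile("\ufb02"), "fl"),                  # "fl" ligature character
    (re.compile(r"(?<=[a-z])-\n(?=[a-z])"), ""),   # word hyphenated across a line break
    (re.compile(r"[ \t]{2,}"), " "),               # runs of spaces or tabs
]


def fix_common_typos(text: str) -> str:
    """Apply each fix in order; anything not matched is left untouched."""
    for pattern, replacement in OCR_FIXES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(fix_common_typos("tlie \ufb01rst exam-\nple  works"))
```

The upside compared to an LLM pass is that nothing outside the listed patterns gets rewritten. The downside is the residual errors, plus the occasional false positive: the hyphen rule, for example, would also join a genuine compound word that happened to break at the end of a line.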