There are very good OCR models. Then it becomes a matter of which letter is which. In Latin script there are only 26 possibilities, plus numbers and symbols.
It's not as simple as just OCR and map though. Some letters want space above them, some want to be placed lower.
Take g, f and c for example.
g and f are about the same height but have different offsets, and c would look like a capital C if scaled to the same size as g and f. (We probably want to auto-adjust scales to match more evenly, unless the text is on a grid, in which case removing the grid is the difficulty.)
These are just the difficulties I found while trying to make a more automated input to FontForge.
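To make the g / f / c point concrete, here is a rough sketch of what I mean by keeping offsets instead of scaling every crop to the same box. It assumes you already have a per-glyph bounding box plus an estimated baseline and x-height for the line (getting those is its own problem). All the names here (`GlyphBox`, `place_glyphs`, `target_x_height`) are made up for illustration; this is not FontForge's API, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class GlyphBox:
    char: str
    top: float     # y of the top edge in image pixels (y grows downward)
    bottom: float  # y of the bottom edge in image pixels


def place_glyphs(boxes, baseline_y, x_height_px, target_x_height=500.0):
    """Convert pixel boxes to font units with ONE shared scale factor.

    Returns {char: (ascent, descent)} measured from the baseline, so
    a descender like g keeps its drop and c stays x-height tall.
    """
    scale = target_x_height / x_height_px
    placed = {}
    for b in boxes:
        ascent = (baseline_y - b.top) * scale      # extent above the baseline
        descent = (b.bottom - baseline_y) * scale  # extent below the baseline
        placed[b.char] = (round(ascent), round(descent))
    return placed


# Invented measurements: baseline at y=100, x-height of 20 px.
boxes = [
    GlyphBox("c", top=80, bottom=100),  # sits on the baseline, x-height tall
    GlyphBox("f", top=70, bottom=100),  # same height as g, but rises above
    GlyphBox("g", top=80, bottom=110),  # same height as f, but hangs below
]
placed = place_glyphs(boxes, baseline_y=100, x_height_px=20)
for char, (ascent, descent) in placed.items():
    print(char, ascent, descent)
```

With these numbers f and g are both 30 px tall, yet f ends up entirely above the baseline and g drops below it, while c stays at x-height instead of being blown up to capital size. On real handwriting the baseline wanders, so you'd probably want to estimate it per word rather than per line.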
Mistral OCR is OCR combined with an LLM. In English, it is as simple as 'just good OCR' though. Check the example on the webpage I linked. The screenshot doesn't show perfect handwriting. The (invisible) line also doesn't go straight.
FTA:
> Handwriting: Mistral OCR accurately interprets cursive, mixed-content annotations, and handwritten text layered over printed forms.
> Forms: Improved detection of boxes, labels, handwritten entries, and dense layouts. Works well on invoices, receipts, compliance forms, government documents, and such.
> Scanned & complex documents: Significantly more robust to compression artifacts, skew, distortion, low DPI, and background noise.
> Complex tables: Reconstructs table structures with headers, merged cells, multi-row blocks, and column hierarchies. Outputs HTML table tags with colspan/rowspan to fully preserve layout.
There are ligatures as well if you’re getting fancy