Hacker News

There are very good OCR models. Then it becomes a matter of which letter is which. In Latin script there are only 26 possibilities, plus numbers and symbols.

1) https://mistral.ai/news/mistral-ocr-3

gunalx 5 hours ago

It's not as simple as just OCR and map, though. Some letters want space above them; some want to be placed lower.

Take g, f, and c, for example.

g and f are about the same height but sit at different offsets, and c would look like a capital C if scaled to the same size as g and f. We probably want to auto-adjust scales to match more evenly, unless the text is on a grid (in which case removing the grid is the difficulty).

These are just the difficulties I found while trying to make a more automated input to FontForge.
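
A rough sketch of the offset/scale point, assuming the OCR step already gives you a pixel bounding box per character plus the baseline it was written on. `GlyphBox`, `place_glyphs`, and the 500-unit x-height are made up for illustration; none of this is FontForge or Mistral API. The idea is to use one shared scale (derived from the x-height) and keep each glyph's distance from the baseline, instead of stretching every crop to fill the same box:

```python
from dataclasses import dataclass


@dataclass
class GlyphBox:
    """Hypothetical per-character box from an OCR pass (image coords, y grows downward)."""
    char: str
    top: float       # y of the top of the ink
    bottom: float    # y of the bottom of the ink
    baseline: float  # y of the writing line this sample sat on


def place_glyphs(boxes, x_height_px, x_height_units=500):
    """Convert pixel boxes to font units using ONE scale for all glyphs.

    Returns {char: (y_min, y_max)} measured from the baseline, so descenders
    come out negative and ascenders come out taller than the x-height.
    x_height_units=500 is an assumed target, not a standard.
    """
    scale = x_height_units / x_height_px
    placed = {}
    for b in boxes:
        y_max = (b.baseline - b.top) * scale     # height above the baseline
        y_min = (b.baseline - b.bottom) * scale  # depth below it (<= 0)
        placed[b.char] = (round(y_min), round(y_max))
    return placed


# Invented measurements: f and g are both 70 px tall, c is 40 px tall.
samples = [
    GlyphBox("c", top=60, bottom=100, baseline=100),
    GlyphBox("f", top=30, bottom=100, baseline=100),
    GlyphBox("g", top=60, bottom=130, baseline=100),
]
extents = place_glyphs(samples, x_height_px=40)
for char, (y_min, y_max) in extents.items():
    print(char, y_min, y_max)
```

With these invented numbers, f and g end up the same height but at different offsets, and c stays at x-height instead of being blown up to a capital C. Finding the baseline and x-height in the scan is the part this sketch skips.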

Fnoord 2 hours ago

Mistral OCR is OCR combined with an LLM. In English, it is as simple as 'just good OCR', though. Check the example on the webpage I linked. The screenshot doesn't show perfect handwriting, and the (invisible) line doesn't run straight either.

FTA:

> Handwriting: Mistral OCR accurately interprets cursive, mixed-content annotations, and handwritten text layered over printed forms.

> Forms: Improved detection of boxes, labels, handwritten entries, and dense layouts. Works well on invoices, receipts, compliance forms, government documents, and such.

> Scanned & complex documents: Significantly more robust to compression artifacts, skew, distortion, low DPI, and background noise.

> Complex tables: Reconstructs table structures with headers, merged cells, multi-row blocks, and column hierarchies. Outputs HTML table tags with colspan/rowspan to fully preserve layout.

lovich 3 hours ago

There are ligatures as well if you’re getting fancy