Hacker News

Hey HN!

Last week, Joe Barrow released CommonForms [1], a set of open models for automatically detecting form fields in PDFs.

He trained two models, FFDNet-S and FFDNet-L, on a dataset of 55k documents. You can read more about his approach in the arXiv paper [2].

As someone who's been searching for reliable models to auto-detect form fields (one of the last hard problems in PDF form filling), I was seriously impressed by the quality of these models. I wanted to give them the attention and distribution they deserve, so I created a fully browser-based implementation that handles both detection and field addition.

My implementation relies on his models and ONNX Runtime Web, plus some post-processing. In the coming days I plan to publish a small browser library that encapsulates it, to make it easier to deploy anywhere (currently you'd have to fork or copy my code).
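For anyone wondering what "some post-processing" typically means for a detector like this, here is a minimal sketch. It assumes the step is a confidence filter followed by class-agnostic non-maximum suppression, which is common for object detectors but not confirmed as what this implementation does. It's written in Python for brevity rather than browser JavaScript, and the `Box` type, the `postprocess` name, the labels, and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One detected field in pixel coordinates (hypothetical structure)."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    label: str  # illustrative label only, not the models' actual class names


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(boxes, min_score=0.3, max_iou=0.5):
    """Drop low-confidence boxes, then greedily suppress overlapping ones.

    Both thresholds are assumed defaults for illustration.
    """
    candidates = sorted(
        (b for b in boxes if b.score >= min_score),
        key=lambda b: b.score,
        reverse=True,
    )
    kept = []
    for box in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, k) <= max_iou for k in kept):
            kept.append(box)
    return kept


# Made-up detections, purely for illustration.
detections = [
    Box(0, 0, 10, 10, 0.9, "text"),
    Box(1, 1, 11, 11, 0.8, "text"),      # heavy overlap with the first box
    Box(20, 20, 30, 30, 0.6, "choice"),
    Box(40, 40, 50, 50, 0.1, "text"),    # below min_score
]
kept = postprocess(detections)
```

With these made-up inputs, the second box is suppressed because it overlaps the higher-scoring first box, and the last one is dropped for low confidence, so two boxes remain.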

Happy to answer any questions about the browser-based implementation!

Questions about the models themselves should be directed to Joe, who I believe is also on HN [3].

[1] https://github.com/jbarrow/commonforms
[2] https://arxiv.org/abs/2509.16506
[3] https://news.ycombinator.com/user?id=jbarrow

jbarrow 2 days ago

Hey, Benjamin, thanks for the attribution! Happy to field any questions HN users have.

It's really gratifying to see people building on the work, and I love that it's possible to do browser-side/on-device.

Shindi a day ago

Tbh this model is extremely bad. I tried a couple of our medical form examples and it found almost none of the fields.

jbarrow a day ago

Super interesting. Would you be willing to try the Python package (https://github.com/jbarrow/commonforms) or share the PDFs?

For the non-ONNX models there are some inference tricks that generally improve performance, and lowering the confidence threshold could potentially help.
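To illustrate why the threshold matters, here is a rough sketch that needs no model at all. The scores are made up, and `kept_at` and `sweep` are hypothetical helper names, not part of the commonforms package. The point is only that fields the model is unsure about get dropped by a strict cutoff and reappear as the cutoff is lowered, at the cost of more false positives.

```python
def kept_at(scores, threshold):
    """Count detections whose confidence is at or above the threshold."""
    return sum(1 for s in scores if s >= threshold)


def sweep(scores, thresholds):
    """Map each candidate threshold to the number of detections it keeps."""
    return {t: kept_at(scores, t) for t in thresholds}


# Made-up confidence scores for one page, purely for illustration.
scores = [0.91, 0.62, 0.41, 0.28, 0.17, 0.08]

counts = sweep(scores, [0.5, 0.3, 0.1])
for threshold, n in counts.items():
    print(f"threshold {threshold}: {n} detections kept")
```

On a real document you would run the same kind of sweep over the model's actual scores and eyeball which threshold recovers the missing fields without flooding the page with spurious ones.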