Super interesting. Would you be willing to try the Python package (https://github.com/jbarrow/commonforms) or share the PDFs?

For the non-ONNX models there are some inference tricks that generally improve performance, and lowering the confidence threshold might also help.
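
To illustrate what I mean by lowering confidence, here's a rough sketch. This is not the commonforms API: `Detection`, `filter_detections`, and the scores are all made up for illustration. The idea is just that a detector gives each candidate field a score, and a lower cutoff keeps more of the borderline ones (at the cost of more false positives).

```python
from dataclasses import dataclass


@dataclass
class Detection:
    # Hypothetical container for one detected form field (not a commonforms type).
    label: str
    confidence: float


def filter_detections(detections, threshold):
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.confidence >= threshold]


# Made-up scores, purely illustrative.
detections = [
    Detection("text_field", 0.82),
    Detection("checkbox", 0.41),
    Detection("signature", 0.27),
]

strict = filter_detections(detections, threshold=0.5)    # drops the two weaker candidates
relaxed = filter_detections(detections, threshold=0.25)  # keeps all three

print([d.label for d in strict])
print([d.label for d in relaxed])
```

So if fields are being missed, it's worth checking whether they show up once the cutoff is lower.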