Understandable. I work in academic publishing, and while the "XML is everywhere" crowd is graying, retiring, or even dying :( it still remains an excellent option for document markup. Additionally, a lot of the government data produced in the US and EU makes heavy use of XML technologies. I imagine they could be an interested consumer of Nanonets-OCR. TEI could be a good choice too, as well-tested and well-developed conversions exist to other popular, less structured formats.
Do check out MyST Markdown (https://mystmd.org)! Academic publishing is a space where MyST is being used, for example https://www.elementalmicroscopy.com/ via Curvenote.
(I'm a MyST contributor)
Do you know why MyST got traction instead of RST, which seems to have had all the custom tagging and extensibility built in from the beginning?
MyST Markdown (the MD flavour, not the same-named Document Engine) was inspired by ReST. It was created to address the main pain point of ReST for incoming users (it's not Markdown!).
As a project, the tooling to parse MyST Markdown was built on top of Sphinx, which primarily expects ReST as input. Now, I would not be surprised if most _new_ Sphinx users are using MyST Markdown (but I have no data there!).
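For anyone wondering what that looks like in practice, switching an existing Sphinx project over is roughly a one-line change to conf.py. This is just a sketch, assuming the myst-parser package is installed (`pip install myst-parser`):

```python
# conf.py -- rough sketch, assuming myst-parser is installed
extensions = [
    "myst_parser",  # lets Sphinx parse MyST Markdown (.md) alongside ReST
]

# Optional: map file suffixes to parsers so .rst and .md pages can coexist
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

For a lot of projects that's most of the switch: existing ReST pages keep working, and new pages can be written in MyST Markdown.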
Subsequently, the Jupyter Book project that built those tools has pivoted to building a new document engine that's better focused on the use cases of our audience and leans into modern tooling.
Maybe even EPUB, which is XHTML.