Super cool. I'm working on a local AI tool specifically for document workflow automation (where context = screen/web/folders/files), and this could come in super useful. I do most of the PDF/DOCX/etc. parsing natively in Rust, but having a nice way to see the output without spinning up Word or PowerPoint is a huge leap.

Thanks for releasing publicly.

Nice, did you write a custom parser for PDF/DOCX? We wrote one for XLSX after running into event-loop issues with SheetJS.

I'm using lopdf[1] for PDF parsing, rtf-parser[2] for RTF, and calamine[3] for XLSX, and I'm sure you know that DOCX/PPTX/etc. files are basically just zip archives of XML + text. The LLM only cares about textual data (which just gets moderately cleaned up post-extraction), so I (thankfully) didn't have to deal with rendering. But showing a preview or end result to a user would be a huge plus, so I can see myself using your library.

[1] https://github.com/J-F-Liu/lopdf

[2] https://github.com/d0rianb/rtf-parser

[3] https://github.com/tafia/calamine
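
To illustrate the "zip file of XML + text" point, here is a rough, std-only sketch of the text-extraction step for DOCX. It is not my actual code: it assumes the contents of `word/document.xml` have already been read out of the archive (e.g. with a zip crate), and the function names `extract_docx_text` and `unescape` are made up for the example. It only looks at `<w:t>` text runs and `</w:p>` paragraph ends, and it will trip on attribute values that contain `>`, so a real XML parser is the safer choice for anything serious.

```rust
/// Decode the five predefined XML entities. `&amp;` goes last so that
/// input like `&amp;lt;` ends up as `&lt;` rather than `<`.
fn unescape(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&apos;", "'")
        .replace("&amp;", "&")
}

/// Pull the visible text out of the contents of `word/document.xml`.
/// Assumption: the caller has already unzipped the .docx and passes the XML
/// in as a string. Only `<w:t>` runs and paragraph ends are handled.
fn extract_docx_text(xml: &str) -> String {
    let mut out = String::new();
    let mut rest = xml;
    while let Some(start) = rest.find('<') {
        let after = &rest[start + 1..];
        let Some(end) = after.find('>') else { break };
        let tag = &after[..end];
        let body = &after[end + 1..];
        if tag == "w:t" || tag.starts_with("w:t ") {
            // Text run: copy everything up to the closing tag.
            // The exact match avoids catching <w:tab/>, <w:tbl>, <w:tc>, etc.
            let close = body.find("</w:t>").unwrap_or(body.len());
            out.push_str(&unescape(&body[..close]));
            rest = &body[close..];
            continue;
        }
        if tag == "/w:p" {
            // End of a paragraph: emit a line break.
            out.push('\n');
        }
        rest = body;
    }
    out
}
```

Calling `extract_docx_text` on the XML string gives plain text with one line per paragraph, which is roughly the shape you'd hand to the LLM after cleanup.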

What about rendering? That's the hard part.

We built a library, @extend-ai/react-xlsx, on top of it that renders the parsed contents onto a canvas.

Testing was mostly manual, with a test corpus we generated. It's not perfect, but it's pretty close for most files we've seen.

For me, rendering was just a nice-to-have.

Sorry, I meant to ask the author of Extend UI, not you.