Hacker News

I read a lot of research papers for work. My workflow evolved around an ever-growing inbox of bookmarked papers from arXiv et al. Great for exploration, but hard to keep track of what I read.

Distillate bridges the tools I already use: Zotero (literature management), reMarkable (reader + highlighter), and Obsidian (notes). It automates the whole pipeline:

$ distillate

save to Zotero ──> auto-syncs to reMarkable
                        │
         read & highlight on tablet
         just move to Read/ when done
                        │
                        V
         auto-saves notes + highlights

It polls Zotero for new papers, uploads PDFs to the reMarkable via rmapi, then watches for papers you've finished reading in your Read folder. When it finds one, it:

- Parses .rm files using rmscene to extract highlighted text (GlyphRange items)

- Searches for that text in the original PDF using PyMuPDF and adds highlight annotations

- Enriches metadata from Semantic Scholar (publication date, venue, citations)

- Creates a structured markdown note with metadata, highlights grouped by page, and the annotated PDF (I keep mine in an Obsidian vault)
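
To make the last step concrete, here is a minimal sketch of rendering a note with highlights grouped by page. It is not Distillate's actual code: the function name `render_note`, the front-matter keys, and the heading layout are all assumptions chosen for illustration.

```python
import json
from collections import defaultdict


def render_note(meta: dict, highlights: list[tuple[int, str]]) -> str:
    """Build a markdown note: YAML front matter, then highlights by page.

    `meta` must contain "title"; the other keys are optional (hypothetical schema).
    `highlights` is a list of (page_number, highlighted_text) pairs.
    """
    by_page = defaultdict(list)
    for page, text in highlights:
        by_page[page].append(text)

    lines = ["---"]
    for key in ("title", "venue", "date", "citations"):
        if key in meta:
            # JSON scalars are valid YAML, so titles containing ":" stay safe.
            lines.append(f"{key}: {json.dumps(meta[key])}")
    lines += ["---", "", f"# {meta['title']}", "", "## Highlights"]

    for page in sorted(by_page):
        lines += ["", f"### Page {page}"]
        lines += [f"- {text}" for text in by_page[page]]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a `.md` file inside an Obsidian vault is enough for Obsidian to pick it up, since a vault is just a folder of markdown files.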

The core workflow just needs Zotero and a reMarkable — no paid APIs, no cloud backend, your notes stay on your machine. Optional extras if you plug them in:

- AI summaries via Claude (one-liner + key learnings from your highlights)

- Daily reading suggestions from your queue

- Weekly email digest via Resend

- Obsidian Bases database for tracking your reading

Stack: rmapi for reMarkable Cloud, rmscene for .rm parsing, PyMuPDF for PDF annotation. Python 3.10+, pip installable.
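
To illustrate the rmapi side, here is a minimal sketch of spotting finished papers. It assumes `rmapi ls <folder>` prints one entry per line as `[f]<TAB>name` for documents and `[d]<TAB>name` for folders. The `/Read` path, the function names, and the `processed` set are hypothetical, and this is not Distillate's actual code.

```python
import subprocess


def parse_rmapi_ls(output: str) -> list[str]:
    """Extract document names from `rmapi ls` output, skipping folders."""
    docs = []
    for line in output.splitlines():
        kind, sep, name = line.partition("\t")
        if sep and kind == "[f]":  # assumed format: "[f]<TAB>name"
            docs.append(name.strip())
    return docs


def list_folder(folder: str = "/Read") -> list[str]:
    """List documents in a reMarkable folder (requires rmapi on PATH)."""
    result = subprocess.run(
        ["rmapi", "ls", folder], capture_output=True, text=True, check=True
    )
    return parse_rmapi_ls(result.stdout)


def newly_finished(current: list[str], processed: set[str]) -> list[str]:
    """Return documents seen in the folder that have not been processed yet."""
    return [name for name in current if name not in processed]
```

A polling loop would call `list_folder()` on a timer, process whatever `newly_finished` returns, and add those names to `processed`.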

The trickiest part was highlight extraction: reMarkable stores highlighted text as GlyphRange items in a scene tree, and matching that text back to positions in the original PDF required fuzzy search with OCR cleanup, plus special merging logic for e.g. cross-page highlights. Happy to say it works well ~99% of the time now.
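
A rough sketch of what the cleanup and fuzzy-search step can look like, using only the standard library. This is not Distillate's actual code: the function names, the sliding-window approach, and the 0.85 threshold are assumptions for illustration, and it returns the matched text rather than PDF coordinates.

```python
import difflib
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold ligatures, rejoin words hyphenated across lines, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)  # the "ﬁ" ligature becomes "fi"
    # Crude: also joins genuine hyphenated compounds that break across lines.
    text = re.sub(r"-\s*\n\s*", "", text)  # "atten-\ntion" becomes "attention"
    return re.sub(r"\s+", " ", text).strip().lower()


def find_best_match(highlight: str, page_text: str, threshold: float = 0.85):
    """Return (matched_text, score), or (None, score) if nothing clears the threshold."""
    needle = normalize(highlight)
    words = normalize(page_text).split()
    n = len(needle.split())
    best_text, best_score = None, 0.0
    # Try windows one word shorter and longer to tolerate split or merged words.
    for size in (n - 1, n, n + 1):
        if size < 1:
            continue
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            score = difflib.SequenceMatcher(None, needle, candidate).ratio()
            if score > best_score:
                best_text, best_score = candidate, score
    if best_score < threshold:
        return None, best_score
    return best_text, best_score
```

The window scan is quadratic in the worst case, which is fine for a single page of text. Once a match is found, the matched string can be handed to the PDF library's own text search to get the rectangles to annotate.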

Install: pip install distillate && distillate --init

Code: https://github.com/rlacombe/distillate

Site: https://distillate.dev

I built this for myself but would love feedback, especially from other reMarkable + Zotero users. What's missing from your workflow? What else should I add?

mtrovo 18 hours ago

I created something very similar, but it sends Raindrop links to the reMarkable and syncs highlights back into Raindrop. I also added a GenAI-powered paper summary filter as a preface to the papers it sends to the reMarkable, which is working quite well.

As you mentioned, the reMarkable file format is a PITA for extracting highlights. One thing that helped a lot was adding an OCR fix phase that uses a Gemini Flash model to fix common OCR errors and to merge single highlights that span pages.

rhl 17 hours ago

Oh nice, that's a great idea! I'm exploring OCR of handwritten notes for future features, will give the Gemini pipeline a try.

rhl 20 hours ago

Adding that I've worked on a CLI install flow that walks you through setting up Zotero, reMarkable, and key optional features as swiftly as possible.

It leaves aside power user features (e.g. emails, GitHub Actions to sync when your laptop is asleep), which are listed here: https://distillate.dev/power-users.html