What does zero-overhead mean here?

Raw protocol, really. No marshaling, no conversions, none of the overhead from type management you get with modern Python, none of the turtles-all-the-way-down dependencies of NodeJS equivalents. I like it, although I would probably port it back to “lightweight” Python in about half the size :)

Interesting to see ppl caring about marshalling overhead when working with LLMs

Some of us still prize compute efficiency, especially those who have been using Python for a long time and are contemplating the new kinds of code patterns that have emerged from data science...