Thanks. The documentation could definitely be fleshed out with some more examples.

You'd likely want to always use that API (or layer something on top of it) unless you're in control of both ends and know they were built with the same toolchain & settings. One place where I've bypassed it is a basic code-gen tool (unfinished, like most personal projects) that generates the serialisation functions at compile time from a very basic DSL describing the network structures of a game protocol I don't control. If it detects that the current toolchain is going to generate a binary-compatible struct layout and there aren't any variable-length fields in there (no strings, basically), it'll generate a memcpy (using get/put on the stream) rather than per-field (de)serialisation. If it can also guarantee alignment of the buffer, which is a tougher requirement to meet, it'll give you a view directly into the network buffer, so you effectively have zero-overhead deserialisation. It's very much a work in progress, but there's scope for making things quite efficient with just a few basic building blocks.
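To make that a bit more concrete, here's a rough sketch of the kind of code such a generator might emit for a fixed-size message. It isn't the tool's actual output: `ByteStream`, `PlayerMove`, `wire_compatible` and `try_view` are all hypothetical names made up for illustration, and the stream is a stand-in that only supports raw `put`/`get`. It also assumes the host byte order matches the wire format, which a real generator would have to check as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a stream type; it only supports raw put/get.
class ByteStream {
public:
    void put(const void* src, std::size_t n) {
        const std::uint8_t* p = static_cast<const std::uint8_t*>(src);
        buf_.insert(buf_.end(), p, p + n);
    }
    void get(void* dst, std::size_t n) {
        assert(read_ + n <= buf_.size());
        std::memcpy(dst, buf_.data() + read_, n);
        read_ += n;
    }
    const std::uint8_t* read_ptr() const { return buf_.data() + read_; }
    std::size_t remaining() const { return buf_.size() - read_; }

private:
    std::vector<std::uint8_t> buf_;
    std::size_t read_ = 0;
};

// Hypothetical message: fixed-width fields only, no strings or arrays.
struct PlayerMove {
    std::uint32_t player_id;
    std::uint16_t x;
    std::uint16_t y;
};

// Checks a generator might rely on before picking the memcpy path.
// Assumption: host byte order already matches the wire format.
template <typename T>
constexpr bool wire_compatible =
    std::is_trivially_copyable<T>::value && std::is_standard_layout<T>::value;

static_assert(wire_compatible<PlayerMove>, "needs per-field serialisation");
static_assert(sizeof(PlayerMove) == 8, "toolchain inserted padding");

// Fast path: one block copy instead of per-field (de)serialisation.
template <typename T>
void serialise(ByteStream& s, const T& v) {
    static_assert(wire_compatible<T>, "needs per-field serialisation");
    s.put(&v, sizeof v);
}

template <typename T>
void deserialise(ByteStream& s, T& v) {
    static_assert(wire_compatible<T>, "needs per-field serialisation");
    s.get(&v, sizeof v);
}

// Zero-copy path: hand back a pointer into the buffer, but only if there
// are enough bytes and the address is suitably aligned for T.
// Caveat: formally this sits in a grey area of the object lifetime rules;
// C++23 adds std::start_lifetime_as for exactly this use case.
template <typename T>
const T* try_view(const std::uint8_t* p, std::size_t n) {
    static_assert(wire_compatible<T>, "needs per-field serialisation");
    if (n < sizeof(T)) return nullptr;
    if (reinterpret_cast<std::uintptr_t>(p) % alignof(T) != 0) return nullptr;
    return reinterpret_cast<const T*>(p);
}
```

The `static_assert`s are where the "same toolchain & settings" condition bites: if a different compiler or packing setting changes the layout, the build fails rather than silently producing a wrong wire format, and the generator would fall back to per-field code. `try_view` returning `nullptr` on a misaligned buffer is just one way to handle the tougher alignment requirement; falling back to `deserialise` into a local copy is the obvious alternative.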

That code-gen would be fantastic. I have commercial applications for this, so I'll keep an eye on your space.