True zero-copy is not achievable with Protobuf; you need something like FlatBuffers for that. What is presented here is closer to zero-allocation parsing.
I also find this misleading, and it could be fixed so easily by just explaining that varints of course need decoding, and that this happens lazily when a field is read rather than eagerly (presumably; I didn’t read the code).
Is this still true? New versions of protobuf can generate accessors that use `std::string_view` rather than `const std::string&` (which forces a copy) for `string` and `bytes` fields.
https://protobuf.dev/reference/cpp/string-view/
It allows avoiding allocations, but it doesn't allow using the serialised data as backing memory for an in-language type. Protobuf varints have to be decoded and written out somewhere. They cannot be lazily decoded efficiently either: the order of fields in the serialised message is unspecified, so a reader must either iterate over the message again and again to find each field on demand, or build a map of offsets, which negates any wins zero-copy strives to achieve.
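To make the cost concrete, here is a minimal sketch of what "lazy" access to one varint field involves. It is hand-rolled against the documented wire format, not code from the protobuf library; `read_varint` and `find_varint_field` are hypothetical names, and deprecated group wire types are simply rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Decode one base-128 varint starting at `pos` and advance `pos` past it.
// Returns nullopt if the input is truncated or longer than 10 bytes.
std::optional<std::uint64_t> read_varint(std::string_view buf, std::size_t& pos) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 70 && pos < buf.size(); shift += 7) {
        auto byte = static_cast<unsigned char>(buf[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;  // high bit clear: last byte
    }
    return std::nullopt;
}

// Find a varint-typed field by walking every record in the message.
// Fields may appear in any order and the last occurrence wins, so the
// scan cannot stop at the first match.
std::optional<std::uint64_t> find_varint_field(std::string_view msg,
                                               std::uint32_t field_number) {
    std::optional<std::uint64_t> result;
    std::size_t pos = 0;
    while (pos < msg.size()) {
        auto tag = read_varint(msg, pos);
        if (!tag) return std::nullopt;
        const std::uint64_t number = *tag >> 3;
        switch (*tag & 7) {
            case 0: {  // varint
                auto v = read_varint(msg, pos);
                if (!v) return std::nullopt;
                if (number == field_number) result = *v;
                break;
            }
            case 1:  // fixed 64-bit
                if (msg.size() - pos < 8) return std::nullopt;
                pos += 8;
                break;
            case 2: {  // length-delimited: skip the payload
                auto len = read_varint(msg, pos);
                if (!len || *len > msg.size() - pos) return std::nullopt;
                pos += static_cast<std::size_t>(*len);
                break;
            }
            case 5:  // fixed 32-bit
                if (msg.size() - pos < 4) return std::nullopt;
                pos += 4;
                break;
            default:  // groups and unknown wire types are not handled here
                return std::nullopt;
        }
    }
    return result;
}
```

Every call to `find_varint_field` is linear in the size of the message, which is the "iterate over and over" option; caching the offsets it discovers is the "map of offsets" option.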
Those field accessors take and return `string_view`, but they still copy. The official C++ library always owns the data internally and never aliases except in one niche use case: the field type is `Cord`, the input is large and meets some other criteria, and the caller used `kParseWithAliasing`, which is undocumented.
To a very close approximation you can say that the official protobuf C++ library always copies and owns strings.
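The distinction between "returns a `string_view`" and "aliases the input" can be shown with two toy types. This is only an illustration of the ownership difference; `OwningMessage`, `AliasingMessage`, and `points_into` are hypothetical names and do not reflect how the generated protobuf classes are actually implemented.

```cpp
#include <functional>
#include <string>
#include <string_view>

// Toy message that owns its field: construction copies the payload bytes.
class OwningMessage {
public:
    explicit OwningMessage(std::string_view payload) : name_(payload) {}
    // Returns a view, but a view of the private copy, not of the input.
    std::string_view name() const { return name_; }
private:
    std::string name_;
};

// Toy message that aliases the input: no copy, but the caller must keep
// the wire buffer alive for as long as the message is used.
class AliasingMessage {
public:
    explicit AliasingMessage(std::string_view payload) : name_(payload) {}
    std::string_view name() const { return name_; }
private:
    std::string_view name_;
};

// True if `field` lies entirely inside `buffer`. std::less_equal gives a
// well-defined ordering even for pointers into unrelated objects.
bool points_into(std::string_view field, std::string_view buffer) {
    std::less_equal<const char*> le;
    return le(buffer.data(), field.data()) &&
           le(field.data() + field.size(), buffer.data() + buffer.size());
}
```

Both accessors have the same signature, so the return type alone says nothing about whether a copy happened during parsing.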
Well that is very disappointing news.
So even the decoder makes a copy, even though it's returning a `string_view`? What's the point, then?
I can understand encoders having to make copies, but not in a decoder.