Review: std.msgpack

Robert Jacques sandford at jhu.edu
Sat Jun 19 10:08:07 PDT 2010


On Sat, 19 Jun 2010 00:02:28 -0400, Masahiro Nakagawa  
<repeatedly at gmail.com> wrote:

> On Fri, 18 Jun 2010 23:47:38 +0900, Robert Jacques <sandford at jhu.edu>  
> wrote:
>>
>> Second, a lot of the design and examples are very similar to the  
>> MsgPack reference implementations, which are all Apache licensed. Since  
>> some of these design decisions seem to not be 'D'-ish (more on that  
>> later) this raises questions on just how clean room the implementation  
>> was.
>
> The author of MessagePack allows this module to apply Boost Software  
> License.
>
>> Third, there's a lot of things that shouldn't be in the final module.
>> -Should be removed:
>> SimpleBuffer, VRefBuffer/vRefBuffer,
>> -Redundant with other parts of phobos:
>> BinaryFileWriter => LockingTextWriter, mp_Type/mp_Object/mp_KeyValue =>  
>> std.variant
>
> BinaryFileWriter removed.
>
> mp_Object is based on std.json.JSONValue.
> Variant can't have mp_pack method.
> Hmm..., I check Variant again.

Well, std.json is an incomplete, alpha library that's buggy and  
unreviewed. For example, the design is currently very C-ish with structs,  
enums, functions and exceptions all being independent top level constructs.

>> -Should be moved elsewhere
>> DeflateFilter/deflateFilter => zip/zlib
>
> I agree.
>
>> Forth, msgpack was designed to be a simple, wire level encoding of some  
>> higher structure. As opposed to JSON or XML, you are not going to be  
>> modifying it dynamically. The very concept that it would use buffers  
>> internally, particularly the existence of the "zero-copy" buffer,  
>> baffles me. (Not to mention it leaves one open to aliasing bugs.)  
>> MsgPack has to decorate every single type it stores, so there is  
>> nothing "zero-copy" about it. To that end, I think MsgPack/MsgUnPack  
>> should wrap an input or output range as appropriate and be able to be  
>> used by a general serialization library such as Orange.
>
> If Phobos has std.serializer like Orange(or Doost.Serialization?),
> I will implement MsgPackArchive.
> But general serializer usually is low-performance and lacks some  
> features.
> So when user wants performance, user will use Packer and Unpacker  
> directly.

I'd disagree, with regard to performance: there's no reason why a general  
(cycle free) serializer would be slower than your direct routines. A  
serializer's only job is to pack/unpack complex objects correctly: the key  
advantage is that polymorphic objects/struct only have to support one  
custom serialization routine/mixin. Also, separating physical-encoding  
 from logical-encoding allows for some very powerful idioms. You could, for  
example, add cycle detection/support to msgpack by having the serializer  
write in a micro-format on-top of msgpack. Another, more important example  
is letting the user decide between an array and a map representation for  
structs/objects. Using arrays for objects is extremely brittle. If say, a  
new hire refactors a class from A{ int x; int y;} to A{ int y; int x; }  
you now have an extremely hard to find and track down logic bug in your  
program. To say nothing about trying to upgrade an implementation. On the  
other hand, using maps for objects adds a decent amount of overhead to the  
total serialized length.

>> Other little things,
>> -Why isn't nil mapped to null?
>
> Where?

MsgPack has a nil type which I assume is equivalent to null. So why isn't  
it simply included in the object serializer (and arguably also in the  
array and map serializer as well)? For that matter, packNil, packTrue,  
packFalse, etc are all redundant with pack(T). Furthermore, stream,  
packArray, packMap and packRaw all should be internal routines. (Though, a  
rawPack(T)(T value) routine might not go amiss) (Returns: this to method  
chain. => Returns: self. i.e. for method chaining )

Getting back to pack(T), there are 18 of these methods, one of which has  
incorrect constraints ( pack(Types...) if (Types.length > 1) ) with four  
different pieces of documentation. Good, clean documentation is extremely  
important to a library, so this needs to be reduced to one documentation  
block and a handful of methods at most. I'd recommend using isNumeric and  
writing an isSerializable template.

Oh, as for packing real you'll need to use CustomFloat!(80) for cross  
platform inter-op. Also, it shouldn't be @system: there's nothing  
memory-unsafe about writing or reading them. Please don't abuse @system,  
etc for meanings other their intended meanings.

>> -Why isn't there any big warnings about serializing cycles?
>
> Oops. Current implementation can't deal with a circular reference.
> I added NOTE section to Packer's comment.
>
>
> Masahiro


More information about the Digitalmars-d mailing list