MessagePack for D released

Sun Apr 25 09:29:07 PDT 2010

On 2010-04-25 08:20:17 -0400, "Masahiro Nakagawa" <repeatedly at gmail.com> said:

> I release a serialization library for Phobos(D2).
> 
> Project repository: http://www.bitbucket.org/repeatedly/msgpack4d
> 
> MessagePack is a binary-based serialization spec.
> See official site for details: http://msgpack.sourceforge.net/
> Some application replace JSON with MessagePack for performance improvement.
> 
> msgpack4d ver 0.1.0 has an equal features with reference implementation.
>   * Zero copy serialization / deserialization
>   * Stream deserializer
>   * Support some D features(Range, Tuple)
> 
> Currently, Phobos doesn't have a real serialization module(std.json 
> lacks  some features)
> I hope Phobos adopts this library for serialization(std.msgpack or  
> std.serialization?).

Looks well done. There's one thing I'd suggest though. I'm pretty sure 
you could make it even faster by skipping the mp_Object intermediary 
representation and using templates. I know it's possible since I've 
done it for a surprisingly similar serialization library I'm working on.

The trick is to reuse the same pattern in the unpacker as you're 
already using in the packer. For instance, the packer has this function:

    ref Packer pack(T)(in T value) if (is(Unqual!T == long))

so the unpacker could have this function (just changing 'in' to 'out'):

    ref Unpacker unpack(T)(out T value) if (is(Unqual!T == long))
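
To make the symmetry concrete, here is a minimal sketch of the two 
functions side by side. It is not msgpack4d's code: the struct layouts, 
the 'buffer' and 'input' members, and the fixed 8-byte big-endian 
encoding are all assumptions made for illustration (the real 
MessagePack wire format uses type tags and variable-width integers).

```d
import std.traits : Unqual;

// Hypothetical packer: appends every long as 8 big-endian bytes.
struct Packer
{
    ubyte[] buffer;

    ref Packer pack(T)(in T value) if (is(Unqual!T == long))
    {
        foreach_reverse (i; 0 .. 8)
            buffer ~= cast(ubyte)(value >> (i * 8));
        return this;
    }
}

// Hypothetical unpacker: same constraint, 'out' instead of 'in'.
// The value is written straight into the caller's variable.
struct Unpacker
{
    const(ubyte)[] input;

    ref Unpacker unpack(T)(out T value) if (is(Unqual!T == long))
    {
        assert(input.length >= 8, "not enough input");
        ulong bits = 0;
        foreach (b; input[0 .. 8])
            bits = (bits << 8) | b;
        value = cast(long) bits;
        input = input[8 .. $];   // consume what was read
        return this;
    }
}
```

Both functions return a reference to their own struct, so calls chain 
the same way on each side: packer.pack(a).pack(b) and 
unpacker.unpack(a).unpack(b).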

My library works by unserializing everything directly at the right place 
in a data structure while it parses the stream. It looks like this:

	MyStruct original;
	Archiver archiver;
	archiver.encode(original);
	immutable(byte)[] data = archiver.output;

	MyStruct copy;
	Unarchiver unarchiver;
	unarchiver.input = data;
	unarchiver.decode(copy);
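
A rough sketch of how decoding directly into a struct can be done, 
without any intermediary object. This is not my library's actual code: 
the Unarchiver layout, the 4-byte big-endian int encoding, and the 
Inner and MyStruct definitions are assumptions chosen to keep the 
example short.

```d
// Hypothetical unarchiver that writes decoded values in place.
struct Unarchiver
{
    const(ubyte)[] input;

    // Assumed wire format: every int is 4 bytes, big-endian.
    void decode(T)(out T value) if (is(T == int))
    {
        assert(input.length >= 4, "truncated input");
        uint bits = 0;
        foreach (b; input[0 .. 4])
            bits = (bits << 8) | b;
        value = cast(int) bits;
        input = input[4 .. $];
    }

    // Structs: visit each field in declaration order and decode it
    // where it lives, recursing into nested structs.
    void decode(T)(ref T value) if (is(T == struct))
    {
        foreach (ref field; value.tupleof)
            decode(field);
    }
}

struct Inner { int x; int y; }
struct MyStruct { int id; Inner pos; }
```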

This is unlike mp_Object which is in itself an intermediary 
representation that sits between the serialized data and the data 
structure you actually want to rebuild. I still have something similar 
to mp_Object as a convenience for types that prefer to implement a 
custom unserialization process in an order not dictated by the input 
stream, but this is less efficient:

	void decode(ref KeyUnarchiver archive) {
		archive.decode("var1", var1);
		archive.decode("var2", var2);
	}
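
Here is a minimal sketch of why the keyed form costs more. The 
KeyUnarchiver below is hypothetical and much simpler than a real one: 
it assumes every field is a long and that the whole record has already 
been parsed into an associative array before decode is called. That 
array is the intermediary representation, and it is what allows fields 
to be requested in any order.

```d
// Hypothetical keyed unarchiver: all fields are parsed up front
// into a table, then looked up by name.
struct KeyUnarchiver
{
    long[string] fields;

    void decode(string key, out long value)
    {
        value = fields[key];   // throws a RangeError if the key is missing
    }
}

struct Point
{
    long var1, var2;

    // The type chooses its own order, independent of the stream.
    void decode(ref KeyUnarchiver archive)
    {
        archive.decode("var2", var2);
        archive.decode("var1", var1);
    }
}
```

The extra allocation and the lookup per field are the price paid for 
not depending on the order of the input stream.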

What I'm trying to get working now is a way to deal with multiple 
references to the same object. I'd also like a nice way to deal with 
Variant, but I'm under the impression this won't be possible without 
adding serialization support directly into Variant, or into TypeInfo.

Masahiro, sorry: this started as a useful commentary on your 
unserializer's approach and I ended up instead promoting what I am 
doing. Your library seems targeted at making a MessagePack serializer, 
with an emphasis on having a simple and portable serialization format, 
which is great when you want to communicate in this format. But on my 
side, I care more about being able to recreate object graphs and 
reinstantiate objects of the correct class when unserializing. That 
does not seem possible with your library, and since MessagePack doesn't 
support this, it doesn't seem likely it can be added easily. Am I 
right?

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/