Range interface for std.serialization

Johannes Pfau nospam at example.com
Thu Aug 22 08:33:06 PDT 2013


On Wed, 21 Aug 2013 22:21:48 +0200,
"Dicebot" <public at dicebot.lv> wrote:

> 
> > Alternative AO2:
> >
> > Another idea is the archive is an output range, having this 
> > interface:
> >
> > auto archive = new XmlArchive!(char);
> > archive.writeTo(outputRange);
> >
> > auto serializer = new Serializer(archive);
> > serializer.serialize(new Object);
> >
> > Use the output range when the serialization is done.
> 
> I can't imagine a use case for this. Adding ranges just because 
> you can is not very good :)
> 


I'm kinda confused why nobody here sees the benefits of the output
range model. Most serialization libraries in other languages are
implemented like that. For example, .NET:

--------
IFormatter formatter = ...;
Stream stream = new FileStream(...);
formatter.Serialize(stream, obj);
stream.Close();
--------

The reason is simple: as far as I know, it is not common to post-process
the serialized data. Usually it's either written to a file or sent over
the network, both of which are perfect examples of Streams (or output
ranges). Common usage is like this:

-------
auto s = FileStream;
auto serializer = Serializer(s);
serializer.serialize(1);
serializer.serialize("Hello");
foreach(value;...)
    serializer.serialize(value);
-------

The classic way to implement this pattern efficiently is with an
OutputRange/Stream. Serialization must be capable of writing hundreds of
megabytes to a file or network without significant memory overhead.
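
To make the pattern concrete, here is a minimal sketch of a serializer that
pushes bytes straight into an output range. SketchSerializer, the
sketchSerializer helper and the byte layout (little-endian ints,
length-prefixed strings) are assumptions made up for illustration, not the
proposed std.serialization API. An Appender stands in for a file or socket.

```d
import std.array : appender;
import std.bitmanip : nativeToLittleEndian;
import std.range : isOutputRange, put;

// Hypothetical serializer: every serialize() call writes directly to the
// sink, so nothing is buffered inside the serializer itself.
struct SketchSerializer(Sink)
    if (isOutputRange!(Sink, ubyte))
{
    Sink sink;

    // Integers: 4 little-endian bytes (an assumed format).
    void serialize(int value)
    {
        ubyte[4] bytes = nativeToLittleEndian(value);
        put(sink, bytes[]);
    }

    // Strings: 32-bit length prefix followed by the raw UTF-8 bytes.
    void serialize(string value)
    {
        serialize(cast(int) value.length);
        put(sink, cast(const(ubyte)[]) value);
    }
}

// Helper so the sink type is inferred at the call site.
auto sketchSerializer(Sink)(Sink sink)
{
    return SketchSerializer!Sink(sink);
}

void main()
{
    auto s = sketchSerializer(appender!(ubyte[])());
    s.serialize(1);
    s.serialize("Hello");
    // 4 bytes for the int, 4 for the length prefix, 5 for "Hello".
    assert(s.sink.data.length == 13);
}
```

Any type that satisfies isOutputRange!(Sink, ubyte) could replace the
Appender here, so the serializer's own memory use stays constant no matter
how much data passes through it.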




There are two specific cases where an InputRange interface can be useful.
The first is when the serializer works as a filter for another range:
--------
auto serializer = new Serializer([1,2,3,4,5].take(3));
foreach(ubyte[] data; serializer)
--------
But an InputRange is limited to a single element type, while the
"serialize" call isn't. Of course you could use Variant, but what about
big structs? Performance matters, so the InputRange approach only works
nicely if you serialize values of the same type.

The other way is if you only want to serialize one element:
--------
auto serializer = new Serializer(myobject);
foreach(ubyte[] data; serializer)
--------

It does not work well if you want to mix it with the "serialize" call:
-------
auto serializer = new Serializer();
serializer.serialize(1);
serializer.serialize("Hello");
serializer.serialize(3);
serializer.serialize(4);
foreach(ubyte[] data; serializer)
-------

Here the serializer has to cache the data or the original objects until
the data is processed via foreach. If the serializer had access to an
output range, the "serialize" calls could write directly to the stream
without any caching. So the output-range model is clearly superior in
this case.
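
The caching cost can be seen in a minimal sketch of the InputRange variant.
CachingSerializer and its byte layout are hypothetical, invented only to
show where the buffering comes from. Each "serialize" call has nowhere to
send its bytes, so it appends a chunk to an internal queue that keeps
growing until somebody iterates.

```d
import std.bitmanip : nativeToLittleEndian;

// Hypothetical InputRange-style serializer: output is cached per call and
// only handed out when the consumer iterates.
struct CachingSerializer
{
    private ubyte[][] pending; // one cached chunk per serialize() call

    void serialize(int value)
    {
        ubyte[4] bytes = nativeToLittleEndian(value);
        pending ~= bytes[].dup; // heap copy kept alive until consumed
    }

    // Input range primitives over the cached chunks.
    @property bool empty() const { return pending.length == 0; }
    @property ubyte[] front() { return pending[0]; }
    void popFront() { pending = pending[1 .. $]; }
}

void main()
{
    CachingSerializer s;
    s.serialize(1);
    s.serialize(3);
    s.serialize(4);

    size_t total;
    foreach (ubyte[] data; s)
        total += data.length;
    assert(total == 12); // three ints, 4 bytes each, all cached first
}
```

With only three ints the queue is harmless, but its size follows the total
amount of serialized data, which is exactly the overhead that writing to an
output range avoids.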

