std.serialization: pre-voting review / discussion

Mon Aug 19 11:05:59 PDT 2013

On Mon, 19 Aug 2013 16:21:44 +0200,
"Tyler Jameson Little" <beatgammit at gmail.com> wrote:

> On Monday, 19 August 2013 at 13:31:27 UTC, Jacob Carlborg wrote:
> > On 2013-08-19 15:03, Dicebot wrote:
> >
> >> Great! Are there any difficulties with the input?
> >
> > It's just that I don't clearly know what the code will need to 
> > look like, and I'm not particularly familiar with implementing 
> > range-based code.
> 
> Maybe we need some kind of doc explaining the idiomatic usage of 
> ranges?
> 
> Personally, I'd like to do something like this:
> 
>      auto archive = new XmlArchive!(char); // create an XML archive
>      auto serializer = new Serializer(archive); // create the serializer
>      serializer.serialize(foo);
> 
>      pipe(archive.out, someFile);

Your "pipe" function is the same as std.algorithm.copy(InputRange,
OutputRange) or std.range.put(OutputRange, InputRange).
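
To make the equivalence concrete, here is a minimal sketch using an
std.array.appender as the output range. The function names viaCopy and
viaPut are only illustrative; they are not part of any proposed API.

```d
import std.algorithm : copy;
import std.array : appender;
import std.range : put;

// Copies all bytes of `source` into a fresh appender via std.algorithm.copy.
ubyte[] viaCopy(const(ubyte)[] source)
{
    auto sink = appender!(ubyte[])();
    source.copy(sink);      // copy(InputRange, OutputRange)
    return sink.data;
}

// Same result via std.range.put, which also accepts a whole range of elements.
ubyte[] viaPut(const(ubyte)[] source)
{
    auto sink = appender!(ubyte[])();
    put(sink, source);      // put(OutputRange, InputRange)
    return sink.data;
}
```

Both functions should yield the same bytes; the only difference is the
argument order and that copy is usable in a UFCS chain.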

An important question regarding ranges for std.serialization is whether
we want it to work as an InputRange or if it should _take_ an
OutputRange. So the question is:

-----------------
auto archive = new Archive();
Serializer(archive).serialize(object);
//Archive takes OutputRange, writes to it
archive.writeTo(OutputRange);

vs

auto archive = new Archive();
Serializer(archive).serialize(object);
//Archive implements InputRange for ubyte[]
foreach(ubyte[] data; archive) {}
-----------------

I'd use the first approach as it should be simpler to implement. The
second approach would be useful if the ubyte[] elements were processed
via other ranges (map, take, ...). But binary data is usually not
processed this way; it is just stored to disk or sent over the network
(basically streaming operations), so the first approach should be fine.
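
As a rough sketch of the two shapes, assuming hypothetical PushArchive
and PullArchive types with a made-up store() method standing in for
whatever the Serializer would call on the archive:

```d
import std.array : appender;
import std.range : put, isOutputRange;

// Approach 1: the archive is handed an output range and writes into it.
struct PushArchive
{
    private ubyte[] buffer;

    // Hypothetical hook the serializer would call with encoded bytes.
    void store(const(ubyte)[] bytes) { buffer ~= bytes; }

    // Writes everything collected so far to any output range of ubyte.
    void writeTo(Sink)(ref Sink sink)
        if (isOutputRange!(Sink, ubyte))
    {
        put(sink, buffer);
    }
}

// Approach 2: the archive itself is an input range of ubyte[] chunks.
struct PullArchive
{
    private ubyte[][] chunks;

    void store(const(ubyte)[] bytes) { chunks ~= bytes.dup; }

    // The three input range primitives.
    @property bool empty() const { return chunks.length == 0; }
    @property ubyte[] front() { return chunks[0]; }
    void popFront() { chunks = chunks[1 .. $]; }
}
```

With PullArchive, foreach (ubyte[] data; archive) {} works as in the
second snippet above, but the archive has to hold on to every chunk
until somebody reads it.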

The first approach has the additional benefit that we can easily do
streaming like this:
----------------
auto archive = new Archive(OutputRange);
//Immediately write the data to the output range
Serializer(archive).serialize([1,2,3]);
----------------
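
A minimal sketch of such a streaming archive, assuming a hypothetical
StreamingArchive type that keeps a pointer to the output range and a
made-up store() method as the entry point for encoded bytes:

```d
import std.array : appender;
import std.range : put, isOutputRange;

// Forwards every byte to the output range immediately; no internal buffer.
struct StreamingArchive(Sink)
    if (isOutputRange!(Sink, ubyte))
{
    private Sink* sink;

    // The caller keeps ownership of the sink; we only remember where it is.
    this(Sink* s) { sink = s; }

    // Hypothetical hook the serializer would call with encoded bytes.
    void store(const(ubyte)[] bytes)
    {
        put(*sink, bytes);
    }
}
```

Usage would then look like this:

```d
auto sink = appender!(ubyte[])();
auto archive = StreamingArchive!(typeof(sink))(&sink);
archive.store(cast(ubyte[])[1, 2, 3]); // bytes reach the sink right away
```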

This is difficult to implement with the second approach as you somehow
have to interleave calls to serialize with reads from the InputRange
interface:
------------
Serializer(archive).serialize(1);
foreach(data; archive) {stdout.write(data);}
Serializer(archive).serialize(2);
foreach(data; archive) {stdout.write(data);}
------------
And it's still less efficient than approach 1 as it has to keep an
internal buffer.

Another point is that "serialize" in the above example could be
renamed to "put". This way Serializer would itself be an OutputRange
which allows stuff like
[1,2,3,4,5].stride(2).take(2).copy(Serializer(archive));

Then serialize could also accept InputRanges to allow this:
Serializer(archive).serialize([1,2,3,4,5].stride(2).take(2));
However, this use case is already covered by using copy so it would just
be for convenience.
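
To illustrate the "put" idea, here is a small sketch. IntSerializer is a
hypothetical stand-in for the real Serializer: it only handles int and
simply writes each value as 4 little-endian bytes, which is an
assumption made for the example and not the proposed wire format.

```d
import std.algorithm : copy;
import std.array : appender;
import std.bitmanip : nativeToLittleEndian;
import std.range : stride, take, isOutputRange;
import std.range.primitives : rangePut = put; // renamed to avoid clashing with the member

// Having a put(int) member is enough to make this an OutputRange of int.
struct IntSerializer(Sink)
    if (isOutputRange!(Sink, ubyte))
{
    Sink* sink;

    void put(int value)
    {
        // Encode the value and forward the bytes to the underlying sink.
        ubyte[4] bytes = nativeToLittleEndian(value);
        foreach (b; bytes)
            rangePut(*sink, b);
    }
}
```

Because of the put member, the serializer can sit at the end of a range
pipeline:

```d
auto sink = appender!(ubyte[])();
auto serializer = IntSerializer!(typeof(sink))(&sink);
[1, 2, 3, 4, 5].stride(2).take(2).copy(serializer); // serializes 1 and 3
```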