std.serialization: pre-voting review / discussion

Tue Aug 20 03:51:24 PDT 2013

On Tue, 20 Aug 2013 10:40:57 +0200,
"ilya-stromberg" <ilya-stromberg-2009 at yandex.ru> wrote:

> On Tuesday, 20 August 2013 at 03:42:48 UTC, Tyler Jameson Little 
> wrote:
> > On Monday, 19 August 2013 at 18:06:00 UTC, Johannes Pfau wrote:
> >> An important question regarding ranges for std.serialization is
> >> whether we want it to work as an InputRange or if it should _take_
> >> an OutputRange. So the question is
> >>
> >> -----------------
> >> auto archive = new Archive();
> >> Serializer(archive).serialize(object);
> >> //Archive takes OutputRange, writes to it
> >> archive.writeTo(OutputRange);
> >>
> >> vs
> >>
> >> auto archive = new Archive();
> >> Serializer(archive).serialize(object);
> >> //Archive implements InputRange for ubyte[]
> >> foreach(ubyte[] data; archive) {}
> >> -----------------
> >>
> >> I'd use the first approach as it should be simpler to implement.
> >> The second approach would be useful if the ubyte[] elements were
> >> processed via other ranges (map, take, ...). But as binary data is
> >> usually not processed in this way but just stored to disk or sent
> >> over network (basically streaming operations) the first approach
> >> should be fine.
> >
> > +1 for the first way.
> 
> No, you are WRONG. InputRange is MORE flexible: it can be lazy or 
> eager. OutputRange is only eager. As we know, lazy ranges are 
> required wherever possible:
> 
> On Sunday, 18 August 2013 at 18:26:55 UTC, Dicebot wrote:
> > So as a review manager, I think voting should be delayed until 
> > API is ready to address lazy range-based work model. No actual 
> > implementation is required but
> >
> > 1) it should be possible to do it later without breaking user 
> > code
> > 2) library should not make an assumption about implementation 
> > being lazy or eager
> 
> We can use InputRange like this:
> 
> import std.file;
> auto archive = new Archive();
> Serializer(archive).serialize(object);
> //Archive implements InputRange for ubyte[]
> write("file", archive);

Yes, InputRange is more flexible, but it's also more difficult to
implement and less efficient. What happens between the 'serialize'
and the 'write' call? Archive has to cache the data: either the
original object or the final produced data in a ubyte[] buffer.
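
A rough sketch of that difference, not the proposed std.serialization
API: serializeTo and BufferedArchive are hypothetical names, and
writing each int as four little-endian bytes is only a stand-in for
real serialization. With an OutputRange the bytes go straight to the
sink; with an InputRange the archive has to hold them until someone
iterates.

```d
import std.array : appender;
import std.range : put, isOutputRange, isInputRange;

// Hypothetical stand-in for a serializer: writes each int as four
// little-endian bytes directly into the given output range.
void serializeTo(Sink)(ref Sink sink, const(int)[] values)
    if (isOutputRange!(Sink, ubyte))
{
    foreach (v; values)
        foreach (shift; [0, 8, 16, 24])
            put(sink, cast(ubyte)((v >> shift) & 0xFF));
}

// Hypothetical InputRange-style archive: it owns a buffer that must
// stay alive between the 'serialize' step and the consumer.
struct BufferedArchive
{
    ubyte[] buffer; // the cache in question

    bool empty() const { return buffer.length == 0; }
    ubyte front() const { return buffer[0]; }
    void popFront() { buffer = buffer[1 .. $]; }
}

unittest
{
    // OutputRange approach: data is written as it is produced.
    auto sink = appender!(ubyte[])();
    serializeTo(sink, [1, 2]);
    assert(sink.data.length == 8);

    // InputRange approach: everything is cached first, then iterated.
    static assert(isInputRange!BufferedArchive);
    auto archive = BufferedArchive(sink.data.dup);
    size_t seen;
    foreach (b; archive)
        ++seen;
    assert(seen == 8);
}
```

This can be tried with "dmd -unittest -main". A truly lazy archive
would avoid the byte buffer, but it would then have to keep the
original object around instead.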

> 
> Another benefit: we can process InputRange. For example, if we 
> have
> ZipRange zip(InputRange)
> function, it's easy to compress data:
> write("file", zip(archive));
> 
> Another example: we would like to change output xml file and 
> filter some data (because we already have it). Or we would like 
> to transform output xml to the html web page. No problems:

Filtering is easier with an InputRange. "Zip-Streams" on the other
hand should be OutputRanges and therefore work fine with both
approaches.
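
To illustrate what I mean by a stream being an OutputRange, here is a
minimal sketch. RleSink is a hypothetical name, and the trivial
run-length encoding only stands in for real compression. The wrapper
accepts bytes through put and forwards the encoded bytes to whatever
sink it wraps, so it can sit between the archive and a file without
caching the whole output.

```d
import std.array : appender;
import std.range.primitives : isOutputRange, rangePut = put;

// Hypothetical "zip stream": an output range that run-length encodes
// incoming bytes and forwards (count, value) pairs to another sink.
struct RleSink(Sink)
    if (isOutputRange!(Sink, ubyte))
{
    Sink* sink;  // wrapped output range, not owned
    ubyte last;  // byte value of the current run
    ubyte count; // length of the current run, 0 if none

    void put(ubyte b)
    {
        if (count > 0 && b == last && count < ubyte.max)
        {
            ++count;
            return;
        }
        flush();
        last = b;
        count = 1;
    }

    // Emits the pending run; must be called once after the last put.
    void flush()
    {
        if (count == 0)
            return;
        rangePut(*sink, count);
        rangePut(*sink, last);
        count = 0;
    }
}

unittest
{
    auto file = appender!(ubyte[])();
    auto rle = RleSink!(typeof(file))(&file);

    ubyte[] input = [7, 7, 7, 9];
    foreach (b; input)
        rle.put(b);
    rle.flush();

    ubyte[] expected = [3, 7, 1, 9];
    assert(file.data == expected);
}
```

Because RleSink is itself an output range of ubyte, an archive that
writes to an OutputRange could be pointed at it directly.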

> XmlRange transformXml(InputRange);
> write("file", transformXml(archive));
> 
> Ideas?

The question is whether there are real-world examples where this is
useful. You have to weigh the utility of this approach against its
more complicated and less efficient implementation.