is std.algorithm.joiner lazy?

Jonathan M Davis via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Apr 7 18:14:11 PDT 2016


On Friday, April 08, 2016 00:30:05 Puming via Digitalmars-d-learn wrote:
> On Thursday, 7 April 2016 at 18:15:07 UTC, Jonathan M Davis wrote:
> > On Thursday, April 07, 2016 08:47:15 Puming via
> >
> > Digitalmars-d-learn wrote:
> >> On Thursday, 7 April 2016 at 08:27:23 UTC, Edwin van Leeuwen
> >>
> >> wrote:
> >> > On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
> >> >> On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van
> >> >> Leeuwen wrote:
> >> >>
> >> >> OK. Even if it consumes the first two elements, then why
> >> >> does it have to consume them AGAIN when actually used? If
> >> >> the function mkarray has side effects, it could lead to
> >> >> problems.
> >> >
> >> > After some testing it seems to get each element twice, calls
> >> > front on the MapResult twice, on each element. The first two
> >> > mkarray are both for first element, the second two for the
> >> > second. You can solve this by caching the front call with:
> >> >
> >> > xs.map!(x=>mkarray(x)).cache.joiner;
> >>
> >> Thanks! I added more elements to xs and checked that you are
> >> right.
> >>
> >> So EVERY element is accessed twice with joiner. Better add
> >> that to the docs, and note the use of cache.
> >
> > I would note that in general, it's not uncommon for an
> > algorithm to access front multiple times. So, this really isn't
> > a joiner-specific issue. If anything, it's map that should get
> > a note in its docs, not joiner. You really should just expect
> > front to be called multiple times. So, if that's a problem, use
> > cache. But joiner is not doing anything abnormal.
>
> But in the joiner docs, it says joiner is lazy. But accessing
> front multiple times is not true laziness. I think it better note
> that after the lazy part: "joiner is lazy, but it will access the
> front twice".
>
> If there are many other lazy functions behave like this, I
> suggest to make a new name for it, like 'semi-lazy', to be more
> accurate.
>
> Maybe its my fault, I didn't know what cache does before Edwin
> told me.
> So there is the solution, it just is not easy for newbies to find
> out because there is no direct link between these functions.

Lazy means that it's not going to consume the entire range when you call the
function. Rather, it's going to return a range that you can iterate over. It
may or may not process the first element before returning, depending on how
it works, and there's definitely nothing that says whether it's going to
access front multiple times or not before calling popFront. And accessing
front multiple times without calling popFront is _normal_ whether you're
dealing with a lazy range or an eager one. All that lazy means is that
you're getting a range from the function rather than it consuming the range
before returning.

So, whatever you do with a range, in general, you have to assume that an
algorithm might access front multiple times, and the implementation is free
to change so that it accesses it more times or fewer times, because the
range API says nothing about whether front is accessed multiple times or
not. front needs to return equal values every time that it's called before
popFront is called, but that doesn't mean that they have to be the same
objects, and it doesn't mean that there's any restriction on how many times
front is accessed before a call to popFront.

So, I see no reason for joiner to say anything in its docs about how many
times it accesses front. It's pretty much irrelevant to how ranges are
expected to work, and it could change. If it actually matters for what
you're doing, then you need to figure out how to rework your code so that it
doesn't matter whether front is accessed multiple times per call to popFront
or not. That's just part of working with ranges, though I can certainly
understand if you didn't realize that previously.
>
> There is another problem, map, cache, and joiner don't work when
> composed multiple times. I've submitted a bug,
> https://issues.dlang.org/show_bug.cgi?id=15891, can you confirm?

Well, given your example, I would strongly argue that you should write a
range that calls read in its constructor and in popFront rather (so that
calling front multiple times doesn't matter) rather than using map. While
map can theoretically be used the way that you're trying to use it, it's
really intended for converting an element using rather than doing stuff like
I/O in it. Also, if the range that you give map is random access (like an
array would be), then opIndex could be used to access random elements, which
_really_ wouldn't work with reading from a file. So, I think that map is
just plain a bad choice for what you're trying to do.

It's not obvious to me why your example is failing to compile - the problem
appears to be with cache specifically and has nothing to do with joiner -
and I am inclined to agree that there's a bug there (be it in cache or in
the compiler), but I really think that using map is a bad move for what
you're trying to do anyway - especially when you consider what will happen
if opIndex is used. I'd strongly encourage you to just write a range that
does what you need instead.

- Jonathan M Davis



More information about the Digitalmars-d-learn mailing list