protocol for using InputRanges
Jonathan M Davis
jmdavisProg at gmx.com
Sun Mar 23 00:53:22 PDT 2014
On Saturday, March 22, 2014 17:50:34 Walter Bright wrote:
> It's become clear to me that we've underspecified what an InputRange is. The
> normal way to use it is:
>
> while (!r.empty) {
>     auto e = r.front;
>     ... do something with e ...
>     r.popFront();
> }
>
> no argument there. But there are two issues:
>
> 1. If you know the range is not empty, is it allowed to call r.front without
> calling r.empty first?
>
> If this is true, extra logic will need to be added to r.front in many cases.
You definitely don't have to call empty before calling front if you know that
the range isn't empty. Both front and empty should normally be pure (or at
least act that way) and essentially act like variables. In most cases, it
works best for the work of the range to go in popFront. The exception is a
random-access range: since any element can be accessed at any time, the work
can't be done in popFront. I think that we have general agreement on this
based on previous discussions, though it's certainly not unanimous.
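
As a rough sketch of what "the work goes in popFront" can look like, here is a
made-up range (Evens and skipOdds are hypothetical names, not anything in
Phobos) that yields the even values of an int[]. The constructor and popFront
do the skipping, so front and empty just read state and can be called in any
order and any number of times:

```d
import std.range : isInputRange;

// Hypothetical example range: yields only the even values of an int[].
struct Evens
{
    private int[] src;

    this(int[] src)
    {
        this.src = src;
        skipOdds(); // prime the range so that front is valid right away
    }

    // empty and front only read state, so they behave like variables.
    @property bool empty() const pure nothrow { return src.length == 0; }

    @property int front() const pure nothrow
    {
        assert(!empty);
        return src[0];
    }

    // All of the real work happens here (and in the constructor).
    void popFront() pure nothrow
    {
        src = src[1 .. $];
        skipOdds();
    }

    private void skipOdds() pure nothrow
    {
        while (src.length != 0 && src[0] % 2 != 0)
            src = src[1 .. $];
    }
}

static assert(isInputRange!Evens);

void main()
{
    auto r = Evens([1, 2, 3, 4, 5, 6]);
    assert(r.front == 2); // known to be non-empty, so no empty check first
    assert(r.front == 2); // calling front again changes nothing
    r.popFront();
    assert(r.front == 4);
}
```

The cost of this design is that the constructor has to do some work up front,
which is one reason not everyone likes it.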
> 2. Can r.front be called n times in a row? I.e. is calling front()
> destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.
If calling front were destructive, that would break a lot of code. That said,
most range-based code should probably avoid calling front multiple times, both
in case front has to do more work than just return the value and to avoid
copying the result on every call. On the other hand, if front returns by
auto ref, it could be more efficient to call it multiple times. So, it's not
entirely clear-cut.
But again, front and empty should normally function as if they were variables.
They should be property functions and should be pure (or at least act like
they're pure). I'm sure that a _lot_ of code will break if that isn't
followed.
There are corner cases which can get a bit mucky though - e.g.

    auto a = map!(to!string)(range);

In this case, front is pure, but it returns a new value each time (albeit a
value that's equal each time until popFront is called). And there's no real
way to fix that if the resulting range is random access (though if it weren't,
the work could go in popFront, which _would_ make it so that front always
returned the same result).
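
A small sketch of that behavior, using a counter (the variable calls is just
for illustration) to show that map reruns the function on every call to front,
and that copying front into a local avoids the repeated work:

```d
import std.algorithm : map;
import std.conv : to;

void main()
{
    int calls; // counts how many times the mapping function runs

    auto a = [1, 2, 3].map!((int x) { ++calls; return x.to!string; });

    // Two calls to front give equal strings, but the function ran twice.
    auto s1 = a.front;
    auto s2 = a.front;
    assert(s1 == s2);
    assert(calls == 2);

    // Copying front once per element keeps the work to one call each.
    a.popFront();
    auto e = a.front;
    assert(e == "2");
    assert(calls == 3);
}
```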
And there have been arguments over whether the result of front should be valid
after popFront has been called (i.e. whether it's transient or not). A lot of
code assumes that it will be, but we have some nasty exceptions (e.g.
std.stdio.ByLine) - typically because front's a buffer which gets reused.
IIRC, in those cases, Andrei favored saying that input ranges that weren't
forward ranges could have a transient front but that forward ranges couldn't
(which I tend to agree with, though I'd prefer that _no_ ranges have transient
fronts, since it can really cause problems - e.g. std.array.array not
working). I don't think that a consensus was reached on that though, since a
few folks really liked using transient fronts with more complicated ranges.
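
To show the transient front problem without involving a file, here is a sketch
with a made-up range (Transient is hypothetical and is not how ByLine is
actually implemented; the 16-character buffer is an assumption for the
example). Its front is a slice of a buffer that popFront overwrites, so
std.array.array ends up with every element aliasing the same memory:

```d
import std.algorithm : map;
import std.array : array;

// Hypothetical input range whose front is a buffer reused by popFront.
struct Transient
{
    private const(string)[] words;
    private char[] buf; // reused storage
    private size_t len;

    this(const(string)[] words)
    {
        this.words = words;
        buf = new char[](16); // assumes every word fits in 16 chars
        fill();
    }

    @property bool empty() const { return words.length == 0; }
    @property char[] front() { return buf[0 .. len]; }

    void popFront()
    {
        words = words[1 .. $];
        fill();
    }

    private void fill()
    {
        if (words.length == 0)
            return;
        len = words[0].length;
        buf[0 .. len] = words[0][]; // overwrites what front pointed at
    }
}

void main()
{
    // Every stored slice refers to the same buffer, which ends up
    // holding the last word.
    auto broken = Transient(["ab", "cd", "ef"]).array;
    assert(broken[0] == "ef" && broken[1] == "ef" && broken[2] == "ef");

    // Copying each front before the next popFront avoids the problem.
    auto fixed = Transient(["ab", "cd", "ef"]).map!(w => w.idup).array;
    assert(fixed == ["ab", "cd", "ef"]);
}
```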
In general though, I think that most of us would agree that front and empty
should be treated as properties - i.e. as if they were variables - and that
they should try to stick to those semantics as closely as possible. Ranges
that stray from that seriously risk not working with a lot of range-based
code.
- Jonathan M Davis