I'm confused about ranges (input and forward in practice, in algorithms)

Thu Aug 13 17:33:28 PDT 2015

Despite using them all the time, I'm suddenly confused about 
ranges...

My understanding is that (for library-worth code) algorithms that 
consume ranges are supposed to use .save() to be compatible with 
both ref type and value type ranges. What about the reverse? Are 
algorithms supposed to try to avoid effectively .save()ing ranges 
(when they are value types) by not copying range variables? 
Furthermore, is it incorrect for an input range that is also not 
a forward range (or at least does not declare itself formally as 
a forward range by having a save()) to allow itself to be bitwise 
copied? (i.e., to effectively provide a save() behaviour).

To make it a more concrete, consider the following:

     struct ValInputRange
     {
         int v;
         int front() { return v; }
         void popFront() { ++v; }
         bool empty = false;
     }

     class RefInputRange
     {
         int v;
         int front() { return v; }
         void popFront() { ++v; }
         bool empty = false;
     }

     void main()
     {
         auto vl = ValInputRange();
         auto rf = new RefInputRange();

         assert(vl.startsWith([0, 1]) == true);
         assert(vl.startsWith([0, 1]) == true);

         assert(rf.startsWith([0, 1]) == true);
         assert(rf.startsWith([0, 1]) == false);

         writeln(vl.take(3)); // [0, 1, 2]
         writeln(rf.take(3)); // [1, 2, 3]
     }

- is startsWith() supposed to be transparently usable with both 
val-type ranges and val-type ranges? Currently it provides 
different semantics for these, arguably, as in the example.

- startsWith accepts an InputRange. What does this mean in 
practice? I'm used (perhaps incorrectly) to thinking about a 
proper InputRange (one that isn't also a ForwardRange, etc.) as 
one that provides a single pass. But many algorithms, such as 
startsWith, have multi-pass semantics: despite not save()ing the 
range explicitly, since startsWith() receives the range by value, 
it is copied, and therefore a subsequent pass is available.

- If the current (dual) behaviour is desired, shouldn't 
startsWith at least document how it mutates the inputed range? I 
mean, to me it wasn't obvious that the last element wouldn't be 
popped after the comparison (it's slightly extra work, but if I'm 
OK with dropping the other elements then I probably don't want 
the last element to remain either), just as an example.

- If not, shouldn't it accept the range by ref?

- IMO, there *really* should be a small reference page on ranges 
that included all of these more subtle points. The most 
reference-style page on ranges is perhaps the documentation for 
std.ranges.interfaces and it doesn't discuss many of these kinds 
of points). BTW, I thought that after the discussion on the 
subject (due to Walter's parser work) it had been decided that it 
was necessary to call empty() before first calling front(), but 
that page says "Calling r.front is allowed only if calling 
r.empty has, *or would have*, returned false." (emphasis added). 
What's the current consensus?

- shouldn't InputRanges be more explicit about their implicit 
ForwardRange'ness or lack thereof? For instance, is it 100% 
obvious to everybody which of these versions will be correct? For 
all acceptable implementations of File and .byLine()?

     void main()
     {
         import std.file : write;
         import std.stdio : File;
         import std.algorithm;

         write("/tmp/test.txt", "line one\nline two\nline three");
         auto lines = File("/tmp/test.txt").byLine;

         // guess which...

         version(one)
         {
             assert(lines.startsWith(["line one", "line two"]) == 
true);
             assert(lines.startsWith(["line one", "line two"]) == 
true);
         }

         version(two)
         {
             assert(lines.startsWith(["line one", "line two"]) == 
true);
             assert(lines.startsWith(["line two", "line three"]) 
== true);
         }
     }