foreach() behavior on ranges

Thu Aug 26 11:59:52 UTC 2021

On Wednesday, 25 August 2021 at 19:51:36 UTC, H. S. Teoh wrote:
> What I understand from what Andrei has said in the past, is 
> that a range is merely a "view" into some underlying storage; 
> it is not responsible for the contents of that storage.  My 
> interpretation of this is that .save will only save the 
> *position* of the range, but it will not save the contents it 
> points to, so it will not (should not) deep-copy.

That definition is potentially misleading if we take into account 
that a range is not necessarily iterating over some underlying 
storage: ranges can also be defined by algorithmic processes.  
(Think e.g. iota, or pseudo-RNGs, or a range that iterates over 
the Fibonacci numbers.)

> However, if the range is implemented by a struct that contains 
> a reference to its iteration state, then yes, to satisfy the 
> definition of .save it should deep-copy this state.

Right.  And in the case of algorithmic ranges (rather than 
container-derived ranges), the state is always and only the 
iteration state.  And then as well as that there are ranges that 
are iterating over external IO, which in most cases can't be 
treated as forward ranges but in a few cases might be (e.g. 
saving the cursor position when iterating over a file's contents).

Arguably I think a lot of problems in the range design derive 
from not thinking through those distinctions in detail 
(external-IO-based vs. algorithmic vs. container-based), even 
though superficially those seem to map well to the input vs 
forward vs bidirectional vs random-access range distinctions.

That's also not taking into account edge cases, e.g. stuff like 
RandomShuffle or RandomSample: here one can in theory copy the 
"head" of the range but one arguably wants to avoid correlations 
in the output of the different copies (which can arise from at 
least 2 different sources: copying under-the-hood pseudo-random 
state of the sampling/shuffling algorithm itself, or copying the 
underlying pseudo-random number generator).  Except perhaps in 
the case where one wants to take advantage of the pseudo-random 
feature to reproduce those sequences ... but then one wants that 
to be a conscious programmer decision, not happening by accident 
under the hood of some library function.

(Rabbit hole, here we come.)

> Andrei has mentioned before that in retrospect, .save was a 
> design mistake.  The difference between an input range and a 
> forward range should have been keyed on whether the range type 
> has reference semantics (input range) or by-value semantics 
> (forward range).  But for various reasons, including the state 
> of the language at the time the range API was designed, the 
> .save route was chosen, and we're stuck with it unless Phobos 
> 2.0 comes into existence.
>
> Either way, though, the semantics of a forward range pretty 
> much dictates that whatever type a range has, if it claims to 
> be a forward range then .save must preserve whatever iteration 
> state it has at that point in time. If this requires 
> deep-copying some state referenced from a struct, then that's 
> what it takes to satisfy the API.  This may take the form of a 
> .save method that copies state, or a copy ctor that does the 
> same, or simply storing iteration state as PODs in the range 
> struct so that copying the struct equates to preserving the 
> iteration state.

Yes.  FWIW I agree that when _implementing_ a forward range one 
should probably make sure that copying by value and the `save` 
method produce the same results.

But as a _user_ of code implemented using the current range API, 
it might be a bad idea to assume that a 3rd party forward range 
implementation will necessarily guarantee that.