called copy constructor in foreach with ref on Range

Mon Jun 22 21:33:08 UTC 2020

On Mon, Jun 22, 2020 at 09:11:07PM +0000, Stanislav Blinov via Digitalmars-d-learn wrote:
> On Monday, 22 June 2020 at 20:51:37 UTC, Jonathan M Davis wrote:
[...]
> > And moving doesn't fix anything, since the original variable is
> > still there (just in its init state, which would still be invalid to
> > use in generic code and could outright crash in some cases if you
> > tried to use it - e.g. if it were a class reference, since it would
> > then be null).
> 
> Eh? A range in 'init' state should be an empty range. If you get a
> crash from that then there's a bug in that range's implementation, not
> in user code.

Don't be shocked when you find out how many Phobos ranges have .init
states that are invalid (e.g., non-empty, but .front and .popFront will
crash / return invalid values).

> > So, code that does a move could accidentally use the original range
> > after the move and have bugs just like code that copies the range
> > has bugs if the original is used after the copy has been made. So,
> > the rule of thumb is not that you should avoid copying ranges. It's
> > that once you've copied a range, you should then use only the copy
> > and not the original.
> 
> That is not true. Copying of forward ranges is absolutely fine. It's
> what the current `save()` primitive is supposed to do. It's the
> copying of input ranges that should just be rejected, statically.

Jonathan is coming from the POV of generic code.  The problem with move
leaving the original range in its .init state isn't so much that it will
crash or anything (even though as you said that does indicate a flaw in
the range's implementation), but that the semantics of generic code
change in subtle ways. For example:

	auto myGenericFunc(R)(R r) {
		...
		foreach (x; r) {
			doSomething(x);
		}
		if (!r.empty)
			doSomethingElse(r);
		...
	}

Suppose for argument's sake that the above foreach/if structure is an
essential part of whatever algorithm myGenericFunc is implementing. Now
there's a problem, because if R has array-like semantics, then the
algorithm will do one thing, but if R has reference-like or move
semantics, then the behaviour of the algorithm will be different, even
if both ranges represent the same sequence of input values.
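
To make the difference concrete, here is a minimal sketch, not code from any real library. RefRange and leftAfterForeach are hypothetical names invented for this illustration: RefRange is a class (so it has reference semantics) wrapping a slice, and leftAfterForeach runs the foreach from the example above and then counts how many elements the original r still holds.

```d
import std.range.primitives : empty, popFront;

// Hypothetical reference-semantics range: a class wrapping a slice.
final class RefRange
{
    int[] data;
    this(int[] data) { this.data = data; }
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Iterates r with foreach, then counts the elements still left in r.
size_t leftAfterForeach(R)(R r)
{
    foreach (x; r) {}
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

void main()
{
    // A slice is not consumed by foreach, so all elements remain.
    assert(leftAfterForeach([1, 2, 3]) == 3);
    // The class range is consumed: foreach's internal copy aliases r.
    assert(leftAfterForeach(new RefRange([1, 2, 3])) == 0);
}
```

Both calls are given the same sequence of values, yet the `if (!r.empty)` branch in myGenericFunc would be taken for the slice and skipped for the class-based range.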

Note that in real-life code, this problem can be far more subtle than a
blatant foreach loop and if statement like the above. For example,
consider a function that drops the first n elements of a range. Your
generic function might want to pop the first n elements then do
something else with the rest of the range.  Well, if you write it the
obvious way:

	auto myAlgo(R)(R r) {
		size_t n = ...;
		dropFirstN(r, n);
		... // do something else with r
	}

then you have a subtle bug, because the state of r after the call to
dropFirstN might be completely different depending on whether r behaves
like an array or like a by-reference or move type.
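
A sketch of this second case, under the assumption that dropFirstN takes its range by value (the post does not show its definition). RefRange and frontAfterDrop are again hypothetical names used only for illustration.

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical reference-semantics range: a class wrapping a slice.
final class RefRange
{
    int[] data;
    this(int[] data) { this.data = data; }
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Assumed definition: pops up to n elements from its by-value parameter.
void dropFirstN(R)(R r, size_t n)
{
    foreach (i; 0 .. n)
    {
        if (r.empty)
            break;
        r.popFront();
    }
}

// Returns the element the caller sees at the front after dropFirstN(r, 2).
int frontAfterDrop(R)(R r)
{
    dropFirstN(r, 2);
    return r.front;
}

void main()
{
    // Slice: dropFirstN advanced only its own copy, so nothing was dropped.
    assert(frontAfterDrop([10, 20, 30]) == 10);
    // Class range: the caller's r was advanced as well.
    assert(frontAfterDrop(new RefRange([10, 20, 30])) == 30);
}
```

One way to sidestep the ambiguity, in line with Jonathan's rule of thumb, is to have the helper return the advanced range and to use only the returned value afterwards; std.range.drop works this way.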

T

-- 
Truth, Sir, is a cow which will give [skeptics] no more milk, and so they are gone to milk the bull. -- Sam. Johnson