Transient ranges

Mon May 30 09:57:29 PDT 2016

On Mon, May 30, 2016 at 09:26:35AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 5/30/16 12:17 AM, Jonathan M Davis via Digitalmars-d wrote:
[...]
> > Having byLine not copy its buffer is fine. Having it be a range is
> > not.  Algorithms in general just do not play well with that
> > behavior, and I don't think that it's reasonable to expect them to.
> 
> I disagree. Most algorithms in std.algorithm are fine with transient
> ranges.

Yes, some years ago I submitted a series of PRs to fix poorly-written
code in Phobos that unnecessarily depended on .front not changing when
popFront is called.  I had found that in the majority of cases it was
actually unnecessary to require non-transience; most algorithms in
std.algorithm, properly-written, work just fine with transient ranges.

For the few algorithms that don't work with transient ranges, arguably
they ought to require forward ranges and use .save instead. The only
exception I can think of right now is array().

[...]
> Here is how I think about it: the front element is valid and stable
> until you call popFront. After that, anything goes for the old front.
> 
> This is entirely reasonable, and fits into many many algorithms. This
> isn't a functional-only language, mutation is valid in D.

I agree: most range algorithms work just fine with transient ranges.  I
consider this a reasonable consequence of defensive programming: don't
assume anything about the API beyond what it actually guarantees. The
current range API says that the current range element is accessible via
.front; from this one may conclude that the value returned by .front
should be valid immediately afterwards. However, the API says nothing
about this value lasting past the next call to .popFront, and in fact it
specifies that .front will return something different afterwards, so
defensively-written code should assume the worst and regard the previous
value of .front as invalidated.
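
To make the distinction concrete, here is a minimal sketch. TransientWords
is a hypothetical range invented for illustration (it is not a Phobos type);
it reuses one buffer for every element, similar in spirit to byLine. The
names longestCached and longestCopied are likewise made up. The first keeps
a slice of .front across popFront; the second copies whatever must outlive
popFront.

```d
import std.range.primitives : isInputRange;

// Hypothetical transient input range: every element lives in the same buffer.
struct TransientWords
{
    private string[] source;  // words not yet loaded
    private char[] buffer;    // single buffer, overwritten by every popFront
    private size_t len;       // length of the current word inside buffer
    private bool hasFront;    // false once the source is exhausted

    this(string[] words)
    {
        source = words;
        buffer = new char[](32); // sketch: assumes no word exceeds 32 chars
        load();
    }

    @property bool empty() const { return !hasFront; }
    @property const(char)[] front() const { return buffer[0 .. len]; }
    void popFront() { load(); }

    private void load()
    {
        hasFront = source.length != 0;
        if (!hasFront) return;
        assert(source[0].length <= buffer.length);
        len = source[0].length;
        buffer[0 .. len] = source[0][]; // overwrite the previous element
        source = source[1 .. $];
    }
}

static assert(isInputRange!TransientWords);

// Assumes non-transience: holds on to a slice of .front across popFront.
const(char)[] longestCached(R)(R r)
{
    const(char)[] best;
    for (; !r.empty; r.popFront())
        if (r.front.length > best.length)
            best = r.front;      // invalidated by the next popFront
    return best;
}

// Transience-agnostic: copies the part of .front that must survive.
string longestCopied(R)(R r)
{
    string best;
    for (; !r.empty; r.popFront())
        if (r.front.length > best.length)
            best = r.front.idup; // private copy, safe after popFront
    return best;
}

unittest
{
    auto words = ["ab", "abcd", "xyz"];
    assert(longestCopied(TransientWords(words)) == "abcd");
    // The cached slice still points into the buffer, which now holds "xyz".
    assert(longestCached(TransientWords(words)) != "abcd");
}
```

With a non-transient range such as string[], both functions return the same
word; only the cached version silently depends on that.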

Understandably, writing code this way is more constrained and less
convenient, but in a standard library I'd expect that code should be at
least of this calibre, if not better.  And as far as I have found, the
majority of range algorithms in Phobos *can* be written this way and
work perfectly fine with transient ranges.  As I've already said, most
of the remaining algorithms can be implemented by requiring forward
ranges and using .save instead of assuming that .front is cacheable --
.save is something guaranteed by the API, and therefore defensively
written generic code should use it rather than making unfounded
assumptions about .front.
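
As a rough illustration of the .save approach, here is a sketch of an
algorithm that has to remember an earlier element. maxPosSketch is a
hypothetical name chosen for this example, not the Phobos implementation of
any function. Instead of caching the value of .front, it remembers a saved
copy of the range positioned at the best element so far, which relies only
on what the forward range API guarantees.

```d
import std.algorithm.comparison : equal;
import std.range.primitives;

// Returns the range positioned at its largest element.
// Remembers a position via .save rather than a cached value of .front.
R maxPosSketch(R)(R r)
if (isForwardRange!R)
{
    if (r.empty)
        return r;

    R best = r.save;             // independent copy of the current position
    for (r.popFront(); !r.empty; r.popFront())
    {
        // best.front is read through the saved range, never from a stale value
        if (best.front < r.front)
            best = r.save;
    }
    return best;
}

unittest
{
    auto r = maxPosSketch([3, 9, 2, 7]);
    assert(r.front == 9);
    assert(equal(r, [9, 2, 7]));
}
```

This works for a transient range only if its .save really yields an
independent range whose .front is unaffected by popFront on the original,
which is what the forward range contract is supposed to provide.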

[...]
> > If it's a range, then it can be passed around to other algorithms
> > with impunity, and almost nothing is written with the idea that a
> > range's front is transient.

And nothing about the range API guarantees that .front is *not*
transient either.  Generic code should not assume either way.  It should
be written with the minimal assumptions necessary to work.

[...]
> > There's no way to check for transience, and I don't think that it's
> > even vaguely worth adding yet another range primitive that has to be
> > checked for everywhere just for this case. Transience does _not_
> > play nicely with algorithms in general.

I disagree. Of the algorithms I surveyed in Phobos at the time, I found
that the majority of them can be written in such a way that they are
transience-agnostic.  Only a small set of algorithms actually require
non-transience.  That's hardly "not play nicely with algorithms in
general".

[...]
> > Pretty much no range-based code is written with the idea that front
> > is transient.
> 
> Provably false, see above.
[...]

I'd argue that range-based generic code that assumes non-transience is
inherently buggy, because generic code ought not to make any
assumptions beyond what the range API guarantees. Currently, the range
API does not guarantee non-transience, therefore code that assumes it is
broken by definition.  Just because such code happens to work most of the
time does not change the fact that it is written incorrectly.

T

-- 
Just because you can, doesn't mean you should.