Range Redesign: Copy Semantics

Mon Jan 22 04:38:13 UTC 2024

On Sunday, January 21, 2024 11:26:37 AM MST Sebastiaan Koppe via Digitalmars-d wrote:
> On Sunday, 21 January 2024 at 05:00:31 UTC, Jonathan M Davis wrote:
> > I've been thinking about this for a while now, but with the
> > next version of Phobos which is in the early planning stages,
> > we really should do some redesigning of ranges.
>
> Thanks for the write up.
>
> Another issue I have encountered is around priming. Sometimes it
> happens in `empty`, sometimes in a private helper function and
> sometimes even in the constructor.
>
> If priming happens in `empty`, then it can't be `const`. Which is
> strange because you would expect `empty` not to mutate.
>
> I have been thinking about having an explicit build and iteration
> phase, where priming happens when you switch from build to
> iteration.
>
> The benefit is that implementers have a clear place where to
> prime the range.

Well, arguably, that's really an implementation detail of the range, and it
would be pretty annoying if range-based code had to worry about explicitly
priming a range (you're basically getting into two-phase construction at
that point, which tends to be problematic). Ideally, once you have a range,
it's ready to use. So, while I could see saying what best practice was on
that, I'm not sure that I'd want to require that it be done a specific way.
Part of the problem is that in my experience, it sometimes makes sense to do
it in one place, whereas with other ranges, it makes more sense to do it
somewhere else. And whether empty mutates really shouldn't matter so long as
it's not actually declared const (it has to be logically const to be sure, but
that doesn't mean that it needs to actually be const). Obviously though, it can
become annoying to wrap a range when you want to make your empty const, and
you can't rely on the range you're wrapping having a const empty. For better
or worse though, the current range API makes no requirements about const
(and I'm not sure if we want to add such requirements given how restrictive
D's const is). If we did require const anywhere on the range API though,
empty and length would be the obvious places.
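
To make the wrapping problem concrete, here is a minimal sketch (not
anything from Phobos). LazyPrimed, Wrapper, and hasConstEmpty are
hypothetical names invented for illustration. The wrapper can only mark its
own empty as const when the wrapped range's empty is callable on a const
object, so it ends up branching on that:

```d
import std.range.primitives : empty, front, popFront, isInputRange;

// Hypothetical range that primes lazily inside empty, so empty cannot be const.
struct LazyPrimed
{
    private int[] data;
    private bool primed;

    @property bool empty()
    {
        if (!primed)
            primed = true; // stand-in for real priming work
        return data.length == 0;
    }

    @property int front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Hypothetical trait: true if empty can be called on a const R.
enum hasConstEmpty(R) = is(typeof((const R r) => r.empty));

// Hypothetical wrapper that forwards to the wrapped range.
struct Wrapper(R)
if (isInputRange!R)
{
    private R inner;

    static if (hasConstEmpty!R)
    {
        @property bool empty() const { return inner.empty; }
    }
    else
    {
        // Forced to be mutable because the wrapped empty is mutable.
        @property bool empty() { return inner.empty; }
    }

    @property auto front() { return inner.front; }
    void popFront() { inner.popFront(); }
}

void main()
{
    static assert(hasConstEmpty!(int[]));
    static assert(!hasConstEmpty!LazyPrimed);

    const w = Wrapper!(int[])([1, 2]);
    assert(!w.empty); // fine: empty is const for this instantiation

    auto l = Wrapper!LazyPrimed(LazyPrimed([1, 2]));
    assert(!l.empty); // works, but only on a mutable wrapper
}
```

Every wrapper that cares about const would need this kind of branching,
which is the annoyance described above.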

That being said, I'm not sure that I've ever written a range that primes
anything in empty. Usually, the question is between what goes in front and
what goes in popFront and how much has to go in the constructor for that
to work cleanly. And the answer is not always the same.
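
As one illustration of that trade-off, the sketch below primes in the
constructor, which lets empty and front stay trivial (and const) at the cost
of doing work before anyone asks for an element. Evens and skipOdds are
hypothetical names, and this is only one of the possible splits, not a
recommended one:

```d
import std.algorithm.comparison : equal;

// Hypothetical filtering range that keeps only the even numbers.
struct Evens
{
    private const(int)[] data;

    this(const(int)[] data)
    {
        this.data = data;
        skipOdds(); // priming: advance to the first even element up front
    }

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }

    void popFront()
    {
        data = data[1 .. $];
        skipOdds(); // re-establish the invariant that data[0] is even
    }

    private void skipOdds()
    {
        while (data.length != 0 && data[0] % 2 != 0)
            data = data[1 .. $];
    }
}

void main()
{
    assert(Evens([1, 2, 3, 4, 6, 7]).equal([2, 4, 6]));
    assert(Evens([1, 3]).empty);
}
```

Moving the skipOdds call out of the constructor and into front or empty
would make construction cheaper, but then those functions could no longer be
const.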

We may also want to rework how front and popFront work for forward ranges,
which could simplify the priming issue depending on what we did. Regardless
though, the priming issue is certainly something that we should be thinking
about. Even if we don't add anything to the API for it, it will be good to
keep it in mind for how it would affect the other range API functions for
any reworking of them that we might do.

Regardless, the big issues that were the point of this thread were the copy
and assignment semantics for ranges and what we would need to do to fix those.
There are definitely other issues that we're going to have to sort out (e.g.
the tail-const issue).

> > What I would propose for that would be a single function
> >
> >    auto next();
> >
> > where next returns a nullable type where the value returned is
> > the next element in the range, with a null value being returned
> > if the range is empty.
>
> What about:
>
> ```
>      bool next(Callable)(Callable c) {
>          if (empty)
>              return false;
>          c(front());
>          popFront();
>          return true;
>      }
> ```
>
> It has the benefit of not needing to unbox a Nullable/Pointer.

Well, that would basically be another sort of opApply, which is an idiom
that tends to be pretty hard to understand in comparison to the
alternatives. Getting a nullable type of some kind back is far easier to
read and understand, and the only real downside I see to it is that you get
bugs if someone tries to access the value when there isn't one. But ranges
already have that in general with front and empty, and I don't think that
it's been much of an issue in practice.
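
For comparison with the callback version, here is a rough sketch of what a
nullable-returning next could look like. It assumes std.typecons.Nullable as
the nullable type, which the proposal does not actually specify, and
Countdown is a hypothetical range made up for the example:

```d
import std.typecons : Nullable, nullable;

// Hypothetical basic input range whose whole API is a single next().
struct Countdown
{
    private int n;

    Nullable!int next()
    {
        if (n <= 0)
            return Nullable!int.init; // null: the range is exhausted
        return nullable(n--);         // yield the current value, then advance
    }
}

void main()
{
    auto r = Countdown(3);
    int[] seen;

    // The caller checks isNull before touching the value, much like
    // checking empty before front with the current API.
    for (auto e = r.next(); !e.isNull; e = r.next())
        seen ~= e.get;

    assert(seen == [3, 2, 1]);
    assert(r.next().isNull);
}
```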

We could also go with next and hasNext like Alexandru suggested, in which
case, we wouldn't be returning a pointer or nullable type if that's the
concern. I'm not sure if that's ultimately better or worse though. In some
respects, it would be better to be able to put all of the range logic in a
single function, but it would also be nice in some situations to be able to
ask whether a basic input range has any elements without having to grab the
first one if it's there.
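
A sketch of the two-function alternative, again with a hypothetical
Countdown range. The exact signatures are an assumption; the point is only
that hasNext can answer the question without consuming anything:

```d
// Hypothetical basic input range using a hasNext/next pair.
struct Countdown
{
    private int n;

    // Can be asked without grabbing an element.
    bool hasNext() const { return n > 0; }

    // Returns the next element and advances; calling it on an
    // exhausted range is a bug, as with front on an empty range.
    int next()
    {
        assert(hasNext(), "next called on an exhausted range");
        return n--;
    }
}

void main()
{
    auto r = Countdown(2);
    assert(r.hasNext());
    assert(r.hasNext()); // asking twice does not consume anything
    assert(r.next() == 2);
    assert(r.next() == 1);
    assert(!r.hasNext());
}
```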

Another consideration that comes to mind is that we might want to add a
function that explicitly skips elements (since in some cases, you can skip
work if you don't actually need the next element), and since next both pops
the element and returns it, we wouldn't have to worry about the state of
front in the process (whereas adding some kind of skip function to the
current range API would result in an invalid front).
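
A minimal sketch of how such a skip function might sit next to next. The
name skip, its bool return, and the Squares range are all hypothetical, and
std.typecons.Nullable is again assumed as the nullable type. The computed
counter exists only to show that skipping avoids the per-element work:

```d
import std.typecons : Nullable, nullable;

// Hypothetical range that squares its elements on demand.
struct Squares
{
    private int[] data;
    size_t computed; // how many elements were actually squared

    Nullable!int next()
    {
        if (data.length == 0)
            return Nullable!int.init;
        immutable v = data[0];
        data = data[1 .. $];
        ++computed;
        return nullable(v * v);
    }

    // Drops the next element without computing it.
    // Returns false if there was nothing to skip.
    bool skip()
    {
        if (data.length == 0)
            return false;
        data = data[1 .. $];
        return true;
    }
}

void main()
{
    auto r = Squares([1, 2, 3]);
    assert(r.skip());            // drops 1 without squaring it
    assert(r.next().get == 4);   // 2 squared
    assert(r.computed == 1);     // only one element was computed
    assert(r.skip());            // drops 3
    assert(!r.skip());           // nothing left
    assert(r.next().isNull);
}
```

Since there is no cached front in this design, skip leaves no stale element
behind.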

- Jonathan M Davis