Range Redesign: Copy Semantics

Mon Jan 22 20:22:07 UTC 2024

On Monday, 22 January 2024 at 04:38:13 UTC, Jonathan M Davis 
wrote:
> On Sunday, January 21, 2024 11:26:37 AM MST Sebastiaan Koppe
>> I have been thinking about having an explicit build and 
>> iteration phase, where priming happens when you switch from 
>> build to iteration.
>>
>> The benefit is that implementers have a clear place to prime 
>> the range.
>
> Well, arguably, that's really an implementation detail of the 
> range, and it would be pretty annoying if range-based code had 
> to worry about explicitly priming a range (you're basically 
> getting into two phase construction at that point, which tends 
> to be problematic). Ideally, once you have a range, it's ready 
> to use. So, while I could see saying what best practice was on 
> that, I'm not sure that I'd want to require that it be done a 
> specific way.
>
> Part of the problem is that in my experience, it sometimes 
> makes sense to do it one place, whereas with other ranges, it 
> makes more sense to do it somewhere else.

Yes exactly. My point is that if there is an explicit place to do 
priming, that uncertainty would go away.
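
To make the two-phase idea concrete, here is a minimal sketch. The names `EvensBuilder`, `Evens`, `evens` and `iterate` are all hypothetical and not part of Phobos; the only assumption is that the build-phase object is not itself a range, and that switching to iteration is the one place where priming happens.

```d
import std.range.primitives : empty, front, popFront, isInputRange;

/// Build phase (hypothetical): holds the source only; it is not a range.
struct EvensBuilder(R)
{
    R source;

    /// Switching to the iteration phase is the single place where priming happens.
    auto iterate()
    {
        auto r = Evens!R(source);
        r.skipOdds(); // prime: advance to the first even element
        return r;
    }
}

/// Iteration phase (hypothetical): always primed, so empty and front do no fix-up work.
struct Evens(R)
{
    private R source;

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }

    void popFront()
    {
        source.popFront();
        skipOdds();
    }

    private void skipOdds()
    {
        while (!source.empty && source.front % 2 != 0)
            source.popFront();
    }
}

/// Hypothetical entry point: returns the builder, not a range.
auto evens(R)(R source) { return EvensBuilder!R(source); }

void main()
{
    import std.stdio : writeln;

    auto builder = evens([1, 2, 3, 4, 5, 6]); // nothing has been consumed yet
    static assert(!isInputRange!(typeof(builder)));
    static assert(isInputRange!(typeof(builder.iterate())));

    foreach (x; builder.iterate())
        writeln(x); // prints 2, 4 and 6 on separate lines
}
```

The cost is the one raised in the quoted reply: callers now have an extra step before they hold something they can iterate.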

> That being said, I'm not sure that I've ever written a range 
> that primes anything in empty. Usually, the question is between 
> what goes in front and what goes in popFront and how much 
> has to go in the constructor for that to work cleanly. And the 
> answer is not always the same.

With 2-phase ranges it would always be the same. Anyway, here are 
some Phobos examples:

- `filter` does its priming in empty, 
https://github.com/dlang/phobos/blob/bf35228426529ab19e5d17a3286d187214bf024a/std/algorithm/iteration.d#L1381

- `filterBidirectional` does a while loop in its constructor, 
https://github.com/dlang/phobos/blob/bf35228426529ab19e5d17a3286d187214bf024a/std/algorithm/iteration.d#L1581

- `chunkBy` calls empty in its constructor, 
https://github.com/dlang/phobos/blob/bf35228426529ab19e5d17a3286d187214bf024a/std/algorithm/iteration.d#L1931

- `substitute` calls empty and popFront in its constructor, 
https://github.com/dlang/phobos/blob/bf35228426529ab19e5d17a3286d187214bf024a/std/algorithm/iteration.d#L6909

Taken together, pretty much anything can happen just by 
constructing a range. You don't even need to iterate it!
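
A rough way to observe this is to count predicate calls when a range is only constructed and never iterated. The sketch below uses the Phobos functions `filter` and `filterBidirectional`; the helper function names are made up for illustration, and the exact counts depend on the Phobos version in use, so it prints them rather than assuming particular values.

```d
import std.algorithm.iteration : filter, filterBidirectional;
import std.stdio : writeln;

/// Predicate calls triggered just by constructing a `filter` range.
int callsWhenConstructingFilter()
{
    int calls;
    // Construct only; never touch empty, front or popFront.
    auto r = [1, 2, 3, 4].filter!((int x) { ++calls; return x % 2 == 0; });
    return calls;
}

/// Predicate calls triggered just by constructing a `filterBidirectional` range.
int callsWhenConstructingFilterBidirectional()
{
    int calls;
    // Construct only; never touch empty, front, back, popFront or popBack.
    auto r = [1, 2, 3, 4].filterBidirectional!((int x) { ++calls; return x % 2 == 0; });
    return calls;
}

void main()
{
    writeln("filter: ", callsWhenConstructingFilter());
    writeln("filterBidirectional: ", callsWhenConstructingFilterBidirectional());
}
```

With a constructor that loops over the input, as in the `FilterBidirectionalResult` code linked above, the second count should be non-zero even though nothing was iterated.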

> Regardless, the big issues that were the point of this thread 
> were the copy and assignment semantics for ranges and what we 
> would need to do to fix those. There are definitely other issues 
> that we're going to have to sort out (e.g. the tail-const 
> issue).

Ultimately it all comes together or it doesn't.

>> What about:
>>
>> ```
>>      bool next(Callable)(Callable c) {
>>          if (empty)
>>              return false;
>>          c(front());
>>          popFront();
>>          return true;
>>      }
>> ```
>>
>> It has the benefit of not needing to unbox a Nullable/Pointer.
>
> Well, that would basically be another sort of opApply, which is 
> an idiom that tends to be pretty hard to understand in 
> comparison to the alternatives.

Actual usage would just use `foreach` of course. But in cases 
where you want to iterate manually you have to deal with the 
added ugliness/complexity, fair.

I do think it's easier to access the item by `ref` this way, 
instead of having to do a pointer in some nullable wrapper.
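
As a sketch of the `ref` point, the callback-style `next` quoted above can be bolted onto an existing range with a small wrapper. `Stepper` is a hypothetical name used only for this illustration; the body of `next` follows the quoted snippet.

```d
import std.range.primitives : empty, front, popFront;

/// Hypothetical wrapper that adds the callback-style `next` to an existing range.
struct Stepper(R)
{
    R source;

    /// Calls `c` with the current element, advances, and returns true;
    /// returns false once the source is exhausted.
    bool next(Callable)(Callable c)
    {
        if (source.empty)
            return false;
        c(source.front); // array front is an lvalue, so a `ref` parameter binds to it
        source.popFront();
        return true;
    }
}

void main()
{
    import std.stdio : writeln;

    int[] data = [1, 2, 3];
    auto s = Stepper!(int[])(data);

    // The callback takes its argument by ref, so it mutates the element in place.
    while (s.next((ref int x) { x *= 10; })) {}

    writeln(data); // [10, 20, 30]
}
```

No pointer or nullable wrapper is needed to reach the element, but manual iteration does become a `while` loop around a callback.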

> Getting a nullable type of some kind back is far easier to read 
> and understand, and the only real downside I see to it is that 
> you get bugs if someone tries to access the value when there 
> isn't one. But ranges already have that in general with front 
> and empty, and I don't think that it's been much of an issue in 
> practice.

I don't particularly like APIs that require methods to be called 
in a particular order; I would much rather have APIs that can't 
be used wrong. Pit of success and all that.

In practice though, I messed up only a few times, and then added 
a condition on `empty` and continued on.

> We could also go with next and hasNext like Alexandru 
> suggested, in which case, we wouldn't be returning a pointer or 
> nullable type if that's the concern. I'm not sure if that's 
> ultimately better or worse though. In some respects, it would 
> be better to be able to put all of the range logic in a single 
> function, but it would also be nice in some situations to be 
> able to ask whether a basic input range has any elements 
> without having to grab the first one if it's there.

With some sources, determining whether there is a next item 
actually involves *getting* the next item. `hasNext` should not 
be where that happens; like `empty`, I think it ought to be 
`const`.
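
One way to reconcile the two, sketched under assumptions: fetch one item ahead in the constructor and in `popFront`, so that the check itself only inspects cached state and can be `const`. The `Pull` alias and the `Buffered` struct are hypothetical names for this illustration; `empty` plays the role that `hasNext` would.

```d
import std.typecons : Nullable, nullable;

/// Hypothetical pull-style source: the only way to learn whether another
/// item exists is to ask for it.
alias Pull = Nullable!int delegate();

/// Hypothetical range that buffers one item so that `empty` can be const.
struct Buffered
{
    private Pull pull;
    private Nullable!int current;

    this(Pull pull)
    {
        this.pull = pull;
        current = pull(); // priming: fetch one item ahead
    }

    /// const, because the answer was already computed by the last fetch.
    @property bool empty() const { return current.isNull; }
    @property int front() const { return current.get; }
    void popFront() { current = pull(); }
}

void main()
{
    import std.stdio : writeln;

    int[] items = [7, 8, 9];
    Pull source = () {
        if (items.length == 0)
            return Nullable!int.init;
        auto head = items[0];
        items = items[1 .. $];
        return nullable(head);
    };

    foreach (x; Buffered(source))
        writeln(x); // prints 7, 8 and 9 on separate lines
}
```

The trade-off is that construction now consumes an item from the source, which is exactly the kind of constructor-time priming discussed earlier in the thread.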