Range Redesign: Empty Ranges

Thu Mar 7 18:32:50 UTC 2024

On Wednesday, 6 March 2024 at 17:38:56 UTC, Paul Backus wrote:
> On Wednesday, 6 March 2024 at 16:47:02 UTC, Steven 
> Schveighoffer wrote:
>> On Wednesday, 6 March 2024 at 14:18:50 UTC, Paul Backus wrote:
>>> On Monday, 4 March 2024 at 21:29:40 UTC, Jonathan M Davis 
>>> wrote:
>>>> 2. The range API provides no way (other than fully iterating 
>>>> through a range) to get an empty range of the same type from 
>>>> a range unless the range is a random-access range.
>>>
>>> Genuine question: what are the use-cases for this?
>>>
>>> In general, the capabilities of ranges are designed to serve 
>>> the needs of algorithms. Input ranges exist because 
>>> single-pass iteration is all that's needed to implement 
>>> algorithms like map, filter, and reduce. Random-access ranges 
>>> exist because they're needed for sorting algorithms. And so 
>>> on.
>>>
>>> I'm not aware of any algorithm, or class of algorithm, that 
>>> needs this specific capability you're describing. If such 
>>> algorithms exist, we should be using them to guide our design 
>>> here. If they don't...then maybe this isn't really a problem 
>>> at all.
>>
>> When you need to pass a specific range type, and you want to 
>> pass in an empty range of that type, how do you do it?
>
> By "specific range type", do you mean a specific category of 
> range (input, forward, random-access, etc.), or a specific 
> concrete type?
>
> If the former, this is already easy to do with existing 
> language and library features (for example, in many cases you 
> can use an empty slice).
>
> If the latter, then the question I'm asking is, why do you need 
> to do that?

I meant the latter. A concrete range type.

How does it happen? Consider that an array is a concrete type 
that people use all the time. Why should any other range be 
different? I've definitely stored ranges and other things as type 
members that were voldemort types.

This ability is more of a question of "do we want to add this 
feature to ranges or not?" The feature doesn't currently exist -- 
you can't assume that an uninitialized range is empty.

Sometimes, we are bitten by the fact that the array is the most 
common range, and behaves in a specific way. People depend on 
that mechanism without realizing it, and then sometime later, 
they decide to change the type to one that is very compatible 
with arrays, but offers some benefit (i.e. to remove an 
allocation). However, the new range type might behave in 
unexpected ways, *but still compiles*.

A few examples I can think of:

1. copying an array is equivalent to `arr.save`, but may not be 
the case for other forward ranges.
2. character arrays have a mechanism to decode into dchar if you 
specify `dchar` as the loop variable type.
3. arrays have a default value of an empty array.

Reducing surprises when you substitute what seems like a 
"compatible" type is desirable, but not strictly necessary. I.e. 
I'm also OK if we don't try and add these definitions.

For the empty array case, I think the mechanism is trivial to add 
as a formal requirement, since nearly all ranges default to 
empty, and the ones that don't are probably easy to change. Maybe 
this isn't the case? I don't know. I think reducing friction when 
it's easy to do is something we should always be looking at.

>> The only tricky aspect is ranges that are references 
>> (classes/pointers). Neither of those to me should be supported 
>> IMO, you can always wrap such a thing in a range harness.
>
> The main thing you lose by dropping support for reference-type 
> ranges is interfaces. In particular, the interface inheritance 
> hierarchy in `std.range.interfaces`, where `ForwardRange` 
> inherits from `InputRange` and so on, cannot really be 
> replicated using `structs` (`alias this` only goes so far).

As mentioned, you can wrap these interfaces into structs, which 
then have better lifetime tracking capabilities.

-Steve