Proposed Changes to the Range API for Phobos v3

Sat May 18 20:21:01 UTC 2024

On Saturday, 18 May 2024 at 14:26:18 UTC, H. S. Teoh wrote:
> On Thu, May 16, 2024 at 08:56:55AM -0600, Jonathan M Davis via 
> Digitalmars-d wrote: [...]
>> 1. The easy one is that the range API functions for dynamic 
>> arrays will not treat arrays of characters as special. A 
>> dynamic array of char will be a range of char, and a dynamic 
>> array of wchar will be a range of wchar.
>> 
>> Any code that needs to decode will need to use the phobos v3 
>> replacement for std.utf's decode or decodeFront - or use 
>> foreach - to decode the code units to code points (and if it 
>> needs to switch encodings, then there will be whatever 
>> versions of byUTF, byChar, etc. that the replacement for 
>> std.utf will have).
>
> I thought we already have this?  std.string.byRepresentation, 
> std.uni.byCodePoint, std.uni.byGrapheme already fill this need.

Yes, all those would be present, except possibly 
`byRepresentation`.

I think the message is that `somestr.front` will not decode; you 
will need to use `decodeFront` or whatnot.
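
To make the difference concrete, here is a minimal sketch using what Phobos v2 has today (`byCodeUnit` and `decodeFront` from `std.utf`). The v3 replacements don't exist yet, so their names and signatures may well differ; this only shows a decoding `front` next to a non-decoding one.

```d
import std.range.primitives : front;
import std.utf : byCodeUnit, decodeFront;

void main()
{
    string s = "\u00E9!"; // U+00E9 is two UTF-8 code units: 0xC3 0xA9

    // Phobos v2 today: front on a string auto-decodes to a dchar.
    static assert(is(typeof(s.front) == dchar));
    assert(s.front == '\u00E9');

    // byCodeUnit shows what a non-decoding front looks like:
    // it yields the first code unit, not the first code point.
    auto units = s.byCodeUnit;
    assert(units.front == 0xC3);

    // Explicit decoding: decodeFront returns the code point and
    // pops the code units it consumed.
    dchar c = units.decodeFront;
    assert(c == '\u00E9');
    assert(units.front == '!');
}
```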

>
>
> [...]
>> However, with infinite ranges, there is no such solution. If 
>> they cannot be default-initialized, then they either can't be 
>> ranges, or they would have to be finite ranges which would 
>> just never be empty if they're constructed at runtime (while 
>> doing something like the flag trick to make their init value 
>> empty). And it's certainly true that the range API doesn't 
>> (and can't) guarantee that finite ranges are truly finite, but 
>> it's still better if we can define infinite ranges that need 
>> to be constructed at runtime as infinite ranges, since then we 
>> can get the normal benefits that come from statically knowing 
>> that a range is infinite.
>
> Infinite ranges also have the peculiarity that slicing may 
> create a finite range, i.e., the underlying type changes. 
> That's another wrinkle to deal with.

The `hasSlicing` test is very hard to understand. But it looks 
like either the range is infinite, or the slice must be the same 
type as the range itself.
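
As a quick check of my reading against current Phobos, using `std.range.repeat` as the infinite range (this is only what v2 does today, not a statement about v3):

```d
import std.range : repeat, isInfinite, hasSlicing;

void main()
{
    auto inf = repeat(1);
    alias R = typeof(inf);
    static assert(isInfinite!R);
    static assert(hasSlicing!R);

    // Slicing to the end keeps the infinite type...
    static assert(is(typeof(inf[1 .. $]) == R));

    // ...but a bounded slice is a different, finite type.
    auto s = inf[0 .. 3];
    static assert(!is(typeof(s) == R));
    static assert(!isInfinite!(typeof(s)));
    assert(s.length == 3);

    // For a finite range, the slice has the same type as the range.
    int[] arr = [1, 2, 3];
    static assert(hasSlicing!(int[]));
    static assert(is(typeof(arr[0 .. 2]) == int[]));
}
```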

So I think we are already dealing with this, and I wouldn't 
expect the rules for `hasSlicing` to change.

Clearly, `.init` is going to be the same type, and we can exploit 
that for emptying a range.

>
>
> [...]
>> 4. All ranges must be either dynamic arrays or structs (and not
>> pointers to structs either).
>
> Although, come to think of it, we could have a .byRef range 
> wrapper that encapsulates a pointer to the range so that 
> changes to iteration state would be preserved.  But then it 
> begs the question, why not just allow pointers in the first 
> place?  Why require jumping through extra hoops?

It is impossible to hook the copying of pointers. And in this 
case, in order to prevent undue aliasing, we *must prohibit* 
copying of input ranges (which pointers to ranges are, even if 
the range itself is a forward range).

The answer, as you said, is to make a generic `byRef` range 
wrapper which is non-copyable (and hence, a proper input range).

The point is to remove the requirement for `save` and use the 
copyability to determine if a range is a forward range. This 
makes sense, since most people forget to call `save` anyway, and 
just count on copying being the same thing. We should just hook 
the thing that people use.
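
A rough sketch of what such a wrapper might look like. `ByRef` and `byRef` are hypothetical names taken from this discussion, not an existing Phobos API, and the real design could differ a lot:

```d
import std.range.primitives : isInputRange, empty, front, popFront;
import std.traits : isCopyable;

// Hypothetical non-copyable wrapper around a pointer to a range.
struct ByRef(R)
if (isInputRange!R)
{
    private R* source;

    @disable this(this); // copying is prohibited, so no hidden aliasing

    @property bool empty() { return (*source).empty; }
    @property auto front() { return (*source).front; }
    void popFront() { (*source).popFront(); }
}

// Hypothetical helper; the caller must keep `r` alive while the
// wrapper is in use.
auto byRef(R)(ref R r)
{
    return ByRef!R(&r);
}

void main()
{
    int[] data = [1, 2, 3];
    auto r = byRef(data);

    // Under the proposed rule, copyability is what separates a
    // forward range from a plain input range.
    static assert(isCopyable!(int[]));
    static assert(!isCopyable!(typeof(r)));

    // Iteration state is preserved in the original range.
    r.popFront();
    assert(r.front == 2);
    assert(data == [2, 3]);
}
```

Note that `foreach` over `r` would try to copy it, so a non-copyable range has to be iterated with an explicit loop unless the language or library accommodates that.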

>> 11. Finite random-access ranges are required to implement 
>> opDollar, and their opIndex must work with $. Similarly, any 
>> ranges which implement slicing must implement opDollar, and 
>> slicing must work with $.
>> 
>> In most cases, this will just be an alias to length, but 
>> barring a language change that automatically treats length as 
>> opDollar (which has been discussed before but has never 
>> materialized and is somewhat controversial given types where 
>> it wouldn't make sense to treat length as opDollar), we have 
>> to require that opDollar be defined, or generic code won't be 
>> able to use $ with indexing or slicing. We probably would have 
>> required it years ago except that it would have broken code to 
>> add the requirement.
> [...]
>
> Will this also require implementing arithmetic operators on the 
> return type of opDollar? Otherwise things like r[0 .. $-1] still 
> wouldn't work correctly. Or r[0 .. complicatedMathFunc(($-1)/2)].

It appears that the intention in current range code (in 
`hasSlicing` at least) is that `$` must be `size_t`, but I'm not 
sure.

If that is the intention, then arithmetic such as `$-1` just 
works, since `size_t` has well-defined behavior.
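
For illustration, a small made-up range (`Squares` is hypothetical) where `opDollar` is just an alias to `length`, so `$` is a `size_t` and ordinary arithmetic works inside the brackets:

```d
import std.range.primitives : isRandomAccessRange, hasSlicing;

// Hypothetical finite random-access range over the squares of
// the integers in [lo, hi).
struct Squares
{
    size_t lo, hi;

    @property bool empty() const { return lo == hi; }
    @property size_t front() const { return lo * lo; }
    void popFront() { ++lo; }
    @property size_t back() const { return (hi - 1) * (hi - 1); }
    void popBack() { --hi; }
    @property Squares save() const { return this; }

    @property size_t length() const { return hi - lo; }
    alias opDollar = length; // $ is length, so $ is a size_t

    size_t opIndex(size_t i) const { return (lo + i) * (lo + i); }
    Squares opSlice(size_t a, size_t b) const
    {
        return Squares(lo + a, lo + b);
    }
}

void main()
{
    static assert(isRandomAccessRange!Squares);
    static assert(hasSlicing!Squares);

    auto r = Squares(0, 5); // 0, 1, 4, 9, 16
    assert(r[$ - 1] == 16);
    assert(r[($ - 1) / 2] == 4);

    auto s = r[0 .. $ - 1]; // 0, 1, 4, 9
    assert(s.length == 4);
    assert(s.back == 9);
}
```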

-Steve