Proposed Changes to the Range API for Phobos v3

Dukc ajieskola at gmail.com
Sat May 18 20:48:51 UTC 2024


Jonathan M Davis wrote on 16.5.2024 at 17:56:
> This is a proposal for an updated input range API for Phobos v3.

Good effort writing all this up! I certainly like most of this. However, 
I do have some questions and proposals.

> 
> Some of the Problems with Input Ranges
> --------------------------------------
> 
> 6. The range API does not currently specify what happens if you copy front
> and then call popFront. For the vast majority of code, it is both assumed
> that you can copy front and that the copy of front will be unaffected by
> calling popFront. However, there _are_ ranges (such as std.stdio's ByLine)
> which will do things like use a dynamic array as a buffer which has its
> elements overwritten in popFront, resulting in the elements of the copy
> being mutated, since it's a slice of the same data. Such ranges are
> sometimes referred to as having a transient front, and they do not work with
> a lot of range-based code. As such, they either need to be disallowed, or we
> need a way to detect such behavior so that range-based code can check for
> it.

I think this one is beyond the scope of the range spec. Whether the 
buffer of `ByLine` gets copied on element copy is an issue about the 
element of the range, not about the range itself. If this needs fixing, 
the element type of `byLine` needs to be changed, not the range spec.

Otherwise `front` would need to always deep copy the element on each 
invocation, which would make any range of ranges horribly inefficient!
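
A minimal sketch of that distinction, assuming a made-up range called 
`ReusedBufferWords` (it is not how `ByLine` is actually implemented). The 
range only reuses storage; whether a copy of `front` survives `popFront` 
depends on whether the element type is a borrowing `char[]` or an owning 
`string`.

```d
import std.range.primitives : isInputRange;

/// Hypothetical range (not std.stdio's ByLine): it copies each word into one
/// shared buffer, so `front` is a slice that the next popFront overwrites.
struct ReusedBufferWords
{
    private string[] pending; // words not yet loaded into the buffer
    private char[] buffer;    // storage shared by every element
    private size_t used;      // length of the current element
    private bool done;

    this(string[] words)
    {
        size_t longest = 0;
        foreach (w; words)
            if (w.length > longest)
                longest = w.length;
        buffer = new char[](longest);
        pending = words;
        popFront(); // load the first word
    }

    @property bool empty() const { return done; }
    @property char[] front() { return buffer[0 .. used]; }

    void popFront()
    {
        if (pending.length == 0)
        {
            done = true;
            return;
        }
        used = pending[0].length;
        buffer[0 .. used] = pending[0][]; // overwrite in place
        pending = pending[1 .. $];
    }
}

static assert(isInputRange!ReusedBufferWords);

unittest
{
    auto r = ReusedBufferWords(["alpha", "omega"]);
    char[] shallow = r.front;   // copies the slice, not the characters
    string deep = r.front.idup; // copies the characters too
    r.popFront();
    assert(shallow == "omega"); // the "copy" changed under us
    assert(deep == "alpha");    // an owning element type is unaffected
}
```

This is the same trade-off std.stdio already exposes as `byLine` versus 
`byLineCopy`.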
> 
> 9. const makes ranges useless. Realistically, a const range is not a range,
> because you can't call popFront, and you can't iterate through it
> (technically, it _is_ possible to have popFront be const in cases where the
> iteration state doesn't actually live in the range, and popFront isn't pure,
> but normal ranges do not work that way).

In principle it works like this: const elements are a non-issue, since 
that is just another element type. The range itself can be const, but it 
has to be cast to a non-const range with const elements (tail const) 
before it can actually be iterated.

In practice, no way! With some care, you can write a range that copes 
with const elements, but even that is surprisingly thorny since a struct 
having a const field runs into all sorts of issues.

But just try designing a range that can be cast between const and tail 
const! Apart from some niche ranges like TakeNone, it would be extremely 
hard, if not impossible, with the current language.
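
A small sketch of the asymmetry, using a made-up `ArrayRange` template. 
Dynamic arrays get the const to tail-const conversion from the language, 
while a struct range has no such conversion unless someone writes it by 
hand for that particular type.

```d
import std.range.primitives : front, isInputRange, popFront;

/// Hypothetical array-backed range, used only for illustration.
struct ArrayRange(T)
{
    T[] data;

    @property bool empty() const { return data.length == 0; }
    @property inout(T) front() inout { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Const elements are just another element type.
static assert(isInputRange!(ArrayRange!int));
static assert(isInputRange!(ArrayRange!(const int)));

// A const range cannot call popFront, so it is not a range at all.
static assert(!isInputRange!(const ArrayRange!int));
static assert(!isInputRange!(const(int[])));

// Arrays convert from const to tail const implicitly...
static assert(is(const(int[]) : const(int)[]));
// ...but two instances of a struct template are unrelated types.
static assert(!is(const(ArrayRange!int) : ArrayRange!(const int)));

unittest
{
    const int[] frozen = [3, 2, 1];
    const(int)[] view = frozen; // language-provided tail-const copy
    view.popFront();
    assert(view.front == 2);

    const locked = ArrayRange!int([3, 2, 1]);
    // The struct needs a hand-written conversion for every such range type.
    auto unlocked = ArrayRange!(const int)(locked.data);
    unlocked.popFront();
    assert(unlocked.front == 2);
}
```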

> 
> 10. While it's not exactly a problem with the range API, it would be nice if
> we didn't have to import std.range / std.range.primitives (or the Phobos v3
> equivalent) to use arrays as ranges.

I respectfully disagree on this one. Ranges are (apart from the foreach 
range API) a Phobos concept, not a language concept. It's good to be able 
to pick whether functions in your module will digest ranges at all and, 
if so, whether through V2, V3 or some other API. I wish for no changes here.
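
As a rough illustration of that opt-in, the sketch below uses a scoped 
import so that only one function treats arrays as ranges. The function 
name `sumWithRangeApi` is made up; the imported primitives are the 
existing Phobos v2 ones.

```d
int sumWithRangeApi(int[] values)
{
    // Scoped import: only this function opts in to the Phobos v2 primitives.
    // Without it, `values.front` and `values.popFront()` do not compile,
    // because druntime's object module defines neither for arrays.
    import std.range.primitives : empty, front, popFront;

    int total = 0;
    for (; !values.empty; values.popFront())
        total += values.front;
    return total;
}

unittest
{
    assert(sumWithRangeApi([1, 2, 3]) == 6);
    assert(sumWithRangeApi(null) == 0);
}
```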


> 
> Overview of Proposed Changes
> ----------------------------
> 
> 3. If a range is finite (that is, it's not statically known to be empty),
> and it can be default-initialized, then its init value must be empty. If it
> cannot be default-initialized, then it must define a function called
> emptyRange which returns an empty range of the same type. Phobos will then
> define a function called emptyRange which takes a finite range which can be
> default-initialized and returns the init value of that range.

I suggest a slight relaxation. A finite range must have an empty init 
value only if it does not provide an explicit `emptyRange`. That is, any 
range, finite or infinite, may have any valid `.init` value, as long as 
its `.emptyRange` (whether explicit or Phobos-provided) is actually empty.
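
A sketch of how that relaxed rule could look in code. Everything here is 
an assumption made for illustration: the proposal does not fix the 
signature of `emptyRange`, so it is modelled as a static member function, 
and `makeEmpty` is a made-up stand-in for the Phobos-provided fallback.

```d
/// Hypothetical finite range whose .init value is NOT empty.
struct FirstThree
{
    int next = 0;

    @property bool empty() const { return next >= 3; }
    @property int front() const { return next; }
    void popFront() { ++next; }

    // Explicit empty value; assumed here to be a static member function.
    static FirstThree emptyRange() { return FirstThree(3); }
}

/// Hypothetical helper standing in for the Phobos-provided fallback:
/// prefer the type's own emptyRange, otherwise use its .init value.
R makeEmpty(R)()
{
    static if (is(typeof(R.emptyRange()) == R))
        return R.emptyRange();
    else
        return R.init;
}

unittest
{
    assert(!FirstThree.init.empty);          // allowed under the relaxation
    assert(makeEmpty!FirstThree().empty);    // explicit emptyRange is used
    assert(makeEmpty!(int[])().length == 0); // no emptyRange: falls back to .init
}
```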


> 4. All ranges must be either dynamic arrays or structs (and not pointers to
> structs either).

Maybe unions too? Not that it's often needed.


> 9. In order to deal with transient front, we will explicitly disallow it. It
> will be a requirement that if front is copied, the copy will be unaffected
> by popFront. Of course, this doesn't prevent other references to the same
> data from mutating it (e.g. if it's a reference type), but calling popFront
> will not mutate it.

I'd rather have this one as a recommendation than as something other 
ranges are allowed to assume. Making it a hard requirement would probably 
turn some current uses of `recurrence` or `generate` into unspecified 
behaviour, silently. In the case of `ByLine`, I agree the proposed design 
is superior, but I wouldn't want to declare the present design illegal either.

> 
> 11. Finite random-access ranges are required to implement opDollar, and
> their opIndex must work with $. Similarly, any ranges which implement
> slicing must implement opDollar, and slicing must work with $.
> 
> In most cases, this will just be an alias to length, but barring a language
> change that automatically treats length as opDollar (which has been
> discussed before but has never materialized and is somewhat controversial
> given types where it wouldn't make sense to treat length as opDollar), we
> have to require that opDollar be defined, or generic code won't be able to
> use $ with indexing or slicing. We probably would have required it years ago
> except that it would have broken code to add the requirement.

Why would any language changes be needed for that? Can't `.length` be a 
Phobos function for types that have `opDollar` but not their own `.length`?
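
A minimal sketch of the idea, assuming a free function template named 
`length` that a library module could export; both it and the `Window` 
type are made up for illustration. Since `$` needs a member `opDollar`, 
the type defines that, and `.length` is supplied from outside via UFCS.

```d
/// Hypothetical library-side `length`: found through UFCS only when the type
/// has opDollar but no `length` member of its own.
auto length(R)(auto ref R r)
if (!__traits(hasMember, R, "length") && is(typeof(r.opDollar) : size_t))
{
    return r.opDollar;
}

/// Hypothetical random-access type that defines opDollar but not length.
struct Window
{
    private int[] data;

    @property size_t opDollar() const { return data.length; }
    int opIndex(size_t i) const { return data[i]; }
}

unittest
{
    auto w = Window([10, 20, 30]);
    assert(w[$ - 1] == 30); // the language rewrites $ to w.opDollar
    assert(w.length == 3);  // resolved to the free function via UFCS
}
```

A type with its own `length` member never reaches the free function, 
because member lookup wins over UFCS.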


> Of course, there will be other minor changes (e.g. I'll probably rename
> isForwardRange to isCopyableRange),

I would set a somewhat high bar for renaming that one while leaving the 
other range names alone. As I understand it, the names input range, 
forward range, bidirectional range and random access range all come from 
the C++ iterator categories, which are named exactly the same. So it's 
probably best either to stick with the standard we follow, or to break 
away in full and reconsider all of the names.


