Forward ranges in Phobos v2

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Nov 4 23:30:05 UTC 2021


On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 2021-11-04 12:39, Paul Backus wrote:
> > Again, this is the same distinction we already have between rvalue
> > `front` and lvalue `front`
> 
> That reminds me, we should drop that like a bad habit too :o).
> 
> Currently ranges have all sorts of weird, random genericity. Recalling
> from memory (perhaps/hopefully some of these have been fixed):

Yeah, we need to get rid of useless genericity, and the API docs should
state clearly and unambiguously exactly what is expected of range
operations.  The current range API suffers from insufficient
clarity, so many such cases went "under the radar" and inevitably ended
up being implemented when some kind soul decided that it would be nice
to support this or that niche case.


> - At least at some point `empty` did not have to return bool, just
> something convertible to bool. Like immutable(bool).

Yeah, .empty should return bool, and only bool.  Not immutable(bool),
not something that uses alias this to convert to bool, nothing of that
sort.

Also, the spec should specify precisely whether .empty must be a
function (and whether it should be a member function, a free function,
or both), or whether it's allowed to be a member variable.  Currently in
my own code I have a few cases where .empty is a variable rather than a
function. It hasn't caused any problems so far, but things like this
must be explicitly stated, otherwise somebody will inevitably write code
that assumes one way or the other, and break things for no good reason.
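
To illustrate the non-function forms of .empty mentioned above, here is a
minimal sketch. Counter and CountDown are hypothetical types made up for
this example; the first uses a manifest constant (the usual idiom for
infinite ranges), the second a plain member variable. As far as I can
tell, both satisfy isInputRange in current Phobos, but nothing in the
docs promises that the member-variable form will keep working.

```d
import std.algorithm.comparison : equal;
import std.range : take;
import std.range.primitives : isInfinite, isInputRange;

// Hypothetical infinite range: `empty` is a manifest constant.
struct Counter
{
    int n;
    enum bool empty = false;
    @property int front() const { return n; }
    void popFront() { ++n; }
}

// Hypothetical finite range: `empty` is a plain member variable
// that popFront keeps up to date.
struct CountDown
{
    int n;
    bool empty;
    this(int start) { n = start; empty = n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; empty = n <= 0; }
}

static assert(isInputRange!Counter && isInfinite!Counter);
static assert(isInputRange!CountDown && !isInfinite!CountDown);

void main()
{
    // Both forms work with ordinary range algorithms.
    assert(Counter(5).take(3).equal([5, 6, 7]));
    assert(CountDown(3).equal([3, 2, 1]));
}
```

Generic code that writes r.empty() with explicit parentheses would
reject both of these, which is exactly the kind of silent assumption a
precise spec would rule in or out.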


> - For a while we had a lively discussion about length returning ulong
> instead of size_t (relevant on 32-bit).

Whichever way we decide, this should be specified clearly and not left
up to interpretation.
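
As a sketch of what "specified clearly" could look like in code, the
check below accepts a length only if its type is exactly size_t.
hasSizeTLength and the three structs are hypothetical names invented for
this example, not Phobos APIs, and picking size_t over ulong is just an
assumption made for the sake of illustration.

```d
// Hypothetical trait (not part of Phobos): `length` must be exactly size_t.
enum hasSizeTLength(R) = is(typeof(R.init.length) == size_t);

struct WithSizeT  { size_t length; }
struct WithULong  { ulong length; }
struct WithUShort { ushort length; }

static assert( hasSizeTLength!WithSizeT);
static assert(!hasSizeTLength!WithUShort);
static assert( hasSizeTLength!(int[]));

// ulong is the same type as size_t only on 64-bit targets,
// so this type is accepted there and rejected on 32-bit.
static assert(hasSizeTLength!WithULong == (size_t.sizeof == 8));

void main()
{
    assert(hasSizeTLength!WithSizeT);
    assert(!hasSizeTLength!WithUShort);
}
```

The last static assert is the 32-bit issue in miniature: the same range
type passes or fails depending on the target.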


> - front could return pretty much what it damn well pleased, including
> qualified data, rvalues vs lvalues, noncopyable stuff, etc.

Yeah, this has been especially troublesome.  I think we should specify
exactly what type(s) and qualifier(s) are permitted to be returned from
.front.
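
A small sketch of the lvalue/rvalue and qualifier variation in question,
using only existing Phobos traits (hasLvalueElements and ElementType
from std.range.primitives). The particular ranges are just convenient
examples I picked.

```d
import std.range : iota;
import std.range.primitives : ElementType, front, hasLvalueElements;

void main()
{
    // A slice's front is an lvalue: it can be assigned through.
    int[] arr = [1, 2, 3];
    static assert(hasLvalueElements!(int[]));
    arr.front = 10;
    assert(arr[0] == 10);

    // iota's front is an rvalue: there is no element to take a reference to.
    static assert(!hasLvalueElements!(typeof(iota(3))));

    // front can also hand back qualified data.
    static assert(is(ElementType!(const(int)[]) == const(int)));
}
```

Generic algorithms have to cope with every one of these combinations
today, which is why pinning down the permitted set matters.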

Don't forget transient values returned by .front that are invalidated by
the next call to .popFront (e.g., std.stdio.File.byLine, which reuses
the line buffer).  The range API needs to explicitly state whether
.popFront is allowed to do this, and if it is allowed, range algorithms
that attempt to cache .front past the next invocation to .popFront must
be rewritten.  (This used to be a pretty big problem, but I think we've
fixed most of the cases in Phobos by now. But it still rears its ugly
head every now and then in user code that makes wrong assumptions about
the lifetime of the value returned by .front.)
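
To make the transience problem concrete without needing a file on disk,
here is a minimal sketch. TransientWords is a hypothetical range written
for this example; it reuses one buffer for every element, in the same
spirit as byLine, and it assumes every word fits in 64 bytes.

```d
// Hypothetical range whose .front is a slice of one reused buffer.
struct TransientWords
{
    private const(string)[] words;
    private char[] buf;   // one heap buffer shared by every element
    private size_t len;

    this(const(string)[] words)
    {
        this.words = words;
        buf = new char[](64);   // assumption: no word is longer than 64 bytes
        load();
    }

    @property bool empty() const { return words.length == 0; }
    @property char[] front() { return buf[0 .. len]; }
    void popFront() { words = words[1 .. $]; load(); }

    // Copies the current word into the shared buffer, overwriting the last one.
    private void load()
    {
        if (words.length == 0) return;
        len = words[0].length;
        buf[0 .. len] = words[0][];
    }
}

void main()
{
    char[][] cached;   // keeps slices of the shared buffer
    string[] copied;   // keeps independent copies

    foreach (w; TransientWords(["abc", "xyz"]))
    {
        cached ~= w;
        copied ~= w.idup;
    }

    // Every cached slice now shows the last word written to the buffer.
    assert(cached[0] == "xyz" && cached[1] == "xyz");
    assert(copied[0] == "abc" && copied[1] == "xyz");
}
```

The cached version compiles cleanly and is silently wrong, which is why
the rule has to live in the spec rather than in folklore.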


> - Thinking how inout interacts with everything ranges is just
> depressing.

inout is the source of all kinds of nastiness in the language. It's a
cute hack that works for the trivial cases, but once you combine it with
other language features it's a mess. Consider this:

	inout T myFunc(T)(inout T delegate(inout T t) dg, inout T u) {...}

Does inout apply to the return value of dg, dg itself, or both? How does
it interact with the inout on the function's return value?  How exactly
does inout on t interact with the delegate's inout return, and how do
they correlate with the inout of the outer function?  This is just one
of many cases of ambiguity; it's not hard to construct other examples.
In short, it's a mess.
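
For contrast with the delegate signature above, this is the kind of
trivial case where inout does its job. firstHalf is a hypothetical
helper made up for this sketch.

```d
// Hypothetical helper: returns the first half of a slice and carries
// the caller's qualifier through to the return type.
inout(int)[] firstHalf(inout(int)[] a)
{
    // Inside the body `a` is treated like const: `a[0] = 1;` would not compile.
    return a[0 .. $ / 2];
}

void main()
{
    int[] m = [1, 2, 3, 4];
    immutable(int)[] i = [5, 6, 7, 8];
    const(int)[] c = m;

    // One function body, three different return types.
    static assert(is(typeof(firstHalf(m)) == int[]));
    static assert(is(typeof(firstHalf(i)) == immutable(int)[]));
    static assert(is(typeof(firstHalf(c)) == const(int)[]));

    firstHalf(m)[0] = 42;   // mutable in, mutable out
    assert(m[0] == 42);
    assert(firstHalf(i).length == 2);
}
```

With a single inout parameter and an inout return there is nothing to
disambiguate. The trouble starts once delegates and multiple inout
positions are involved, as in myFunc.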

And don't forget that inout behaves like const inside the function body,
but when passed as a template argument it triggers a separate
instantiation (template bloat).

And trying to work with inout in generic code where you have to deal
with arbitrary incoming type qualifiers is an exercise in pain.

I think we should just flat out *not* support inout in ranges.


> - I seem to recall there was at least one popFront that returned
> something meaningful. (Maybe that's not too disruptive.)

It should be mandated by spec to return void.


> Based on past experience we could and should simplify the range
> interface in places where genericity has little value and the
> implementation effort is high.

+1.

Plus, the *exact* expectations of the various range functions should be
spelled out in clear, unambiguous terms: ref or non-ref, const or
mutable, function or member variable (or free function), transient
.front or not, copyable or not, what exactly .popFront returns, etc.
There must be no room left for interpretation except where explicitly
allowed.  Leave any small detail unspecified, and we can almost be
guaranteed to be bitten by it later.

Best to spell out the exact permitted function signatures and types,
with a list of allowed qualifiers, to leave no room for
misinterpretation.
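
As a rough sketch of what a tightened-up definition might look like when
expressed as a compile-time check: isStrictInputRange below is a
hypothetical trait, not a Phobos API, and the specific rules it encodes
(empty yields exactly bool, popFront yields exactly void, front yields
anything but void) are only one possible reading of the points above.

```d
import std.traits : ReturnType;

// Hypothetical trait, not part of Phobos.
enum isStrictInputRange(R) =
    is(ReturnType!((R r) => r.empty) == bool)          // exactly bool
    && is(typeof((R r) => r.front))                     // front must exist
    && !is(ReturnType!((R r) => r.front) == void)       // and yield a value
    && is(ReturnType!((R r) => r.popFront()) == void);  // exactly void

struct Good
{
    int n;
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

// popFront returns something "meaningful": rejected.
struct ChattyPop
{
    int n;
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    int popFront() { return --n; }
}

// empty is merely convertible to bool: rejected.
struct IntEmpty
{
    int n;
    @property int empty() const { return n <= 0 ? 1 : 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

static assert( isStrictInputRange!Good);
static assert(!isStrictInputRange!ChattyPop);
static assert(!isStrictInputRange!IntEmpty);

void main()
{
    assert(isStrictInputRange!Good);
    assert(!isStrictInputRange!ChattyPop);
}
```

Note that this sketch still accepts .empty as a member variable or enum,
because the lambda only looks at the type of the expression r.empty. A
real spec would have to decide that question separately.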


T

-- 
I am not young enough to know everything. -- Oscar Wilde

