Range Redesign: Copy Semantics

Mon Jan 22 23:07:28 UTC 2024

On Sunday, January 21, 2024 10:22:44 PM MST Paul Backus via Digitalmars-d 
wrote:
> On Monday, 22 January 2024 at 03:52:02 UTC, Jonathan M Davis wrote:
> > On Sunday, January 21, 2024 6:50:26 AM MST Alexandru Ermicioi via
> > Digitalmars-d wrote:
> >> Then new input range api can also be propagated to other types
> >> of range such as forward range and further.
> >
> > Part of the point here is to _not_ do that, because they
> > fundamentally have different semantics.

> If you split the API in two, by making input streams (basic input
> ranges) and forward ranges completely disjoint, you undermine
> this goal. Now, each data structure has to implement *two* APIs,
> and each algorithm has to be implemented *twice*, once for input
> streams and once for forward ranges.
>
> In practice, what's going to happen is library authors will
> simply not bother to implement both, and we will end up with gaps
> where many (data structure, algorithm) pairs are not covered.

In practice, basic input ranges don't work with the vast majority of
algorithms anyway. Some do (e.g. map), but pretty much anything that needs
to do more than a simple transformation on the elements ends up needing a
forward range. I really don't think that we're going to lose much by forcing
basic input ranges to be separate.
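
As a rough illustration of that split under the current API (a minimal
sketch; OneShot is a hypothetical range type invented for this example, not
something from Phobos), a range without save is accepted by map but rejected
by an algorithm such as isSorted, which is constrained on isForwardRange:

```d
import std.algorithm.iteration : map;
import std.algorithm.sorting : isSorted;
import std.range.primitives : isForwardRange, isInputRange;

// Hypothetical basic input range: it deliberately has no save.
struct OneShot
{
    private int[] data;

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

static assert(isInputRange!OneShot);
static assert(!isForwardRange!OneShot);

void main()
{
    // A simple element-wise transformation only needs an input range.
    auto doubled = OneShot([1, 2, 3]).map!(x => x * 2);
    assert(doubled.front == 2);

    // isSorted requires a forward range, so OneShot does not compile with it.
    static assert(!__traits(compiles, isSorted(OneShot([1, 2, 3]))));

    // A dynamic array is a forward range, so the same call is fine.
    assert(isSorted([1, 2, 3]));
}
```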

> > Restricting copying would make ranges borderline unusable. They
> > have to be able to be passed around and be wrappable, which
> > means either copying them or moving them, and moving them would
> > be very un-user-friendly, since it would require explicit calls
> > to move all over the place.
>
> Worth noting that generic range algorithms already have to
> account for the existence of non-copyable types, since even if a
> range itself is copyable, its elements may not be.

Do they? Non-copyable types do not work with ranges right now. They don't
work with foreach right now. Non-copyable types have their uses to be sure,
but they're also fairly niche. Realistically, very little range-based code
is going to work with them because of how restricted they are to deal with,
and unless you're explicitly testing code with them, you're not going to end
up writing code that works with them. We could bend over backwards to try to
make them work in Phobos, but no one else is going to do that unless they're
using non-copyable types in ranges themselves (which almost no one will be
doing even if we support it), and trying to support non-copyable types will
actively make range-based code worse in many cases, because it will require
calling front multiple times, and many range-based algorithms do real work
in front rather than simply returning a value (e.g. map works that way).
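
To make the cost concrete, here is a minimal sketch (the helper name
lambdaCallsForTwoFronts is made up for this example) showing that map does
not cache its result, so each evaluation of front reruns the mapped function:

```d
import std.algorithm.iteration : map;
import std.range : iota;

// Counts how often the mapped function runs when front is read twice.
int lambdaCallsForTwoFronts()
{
    int calls = 0;

    // map is lazy: constructing the range does not run the lambda.
    auto r = iota(3).map!((int x) { ++calls; return x * 2; });

    auto a = r.front; // runs the lambda once
    auto b = r.front; // runs it again for the same element
    assert(a == 0 && b == 0);

    return calls;
}

void main()
{
    assert(lambdaCallsForTwoFronts() == 2);
}
```

Code that cannot copy the element into a local variable has to go back to
front each time it needs the value, which repeats this work.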

Granted, range-based code in general doesn't tend to be very disciplined
about whether it calls front once or several times, but forcing it to be
several times in order to support non-copyable types will have a real cost
for code that is _way_ more common than non-copyable types.

Personally, I think that we should make it very explicit that non-copyable
types are not supported by ranges. They don't work with them now, and I
don't think that it's even vaguely worth it to try to make ranges work with
them. The cost is too high for too little benefit.

> In light of the points above, my proposed copying semantics for
> input streams are:
>
> 1. Input streams MAY be non-copyable, but are not required to be.
> 2. If you copy an input stream and then call next() on the
> original, the behavior of both the original and the copy is
> unspecified.
>
> That is, I think we should give up on having 100% consistent
> copying semantics in this one case, in order to keep the overall
> API unified (by supporting UFCS next()) and avoid unnecessary
> pessimization of range algorithms.
>
> The good news is that with these semantics, if you write your
> generic code conservatively and treat all input streams as
> non-copyable, you'll get the right behavior in all cases. And if
> you don't, you'll get a compile-time error as soon as you
> actually try to pass in a non-copyable input stream, and you'll
> know exactly how to fix it. So this design is still an
> improvement on the status quo.
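
For reference, the quoted semantics might look roughly like the sketch
below. This is only an assumption about the shape of the API: the thread
does not pin down the signature of next(), so the Nullable return type, the
Numbers type, and the sum helper are all hypothetical.

```d
import core.lifetime : move;
import std.typecons : Nullable, nullable;

// Hypothetical non-copyable input stream.
struct Numbers
{
    private int[] data;

    @disable this(this); // copying is a compile-time error

    // Assumed shape of next(): null once the stream is exhausted.
    Nullable!int next()
    {
        if (data.length == 0)
            return Nullable!int.init;
        auto value = data[0];
        data = data[1 .. $];
        return nullable(value);
    }
}

static assert(!__traits(isCopyable, Numbers));

// Conservative generic code: it takes ownership of the stream and never
// copies it, so it works for copyable and non-copyable streams alike.
int sum(S)(S stream)
{
    int total = 0;
    for (auto e = stream.next(); !e.isNull; e = stream.next())
        total += e.get;
    return total;
}

void main()
{
    auto s = Numbers([1, 2, 3]);
    // sum(s) would not compile; the caller has to move explicitly.
    assert(sum(move(s)) == 6);
}
```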

I really think that trying to support non-copyable types is not worth it -
either for the ranges themselves or for their elements. It's not something
that very many people are even going to think about, let alone do. So, if it
were supported, realistically, it would just be with the algorithms in
Phobos and with the code that the few people using such types wrote for
themselves. And even in Phobos, even if we wanted to support it,
realistically, it wouldn't work a lot of the time until the few people that
cared complained about it, because very few programmers would even think
about testing with non-copyable types. It would be _far_ worse than the
situation we have now with code needing to call save to work correctly and
yet failing to call save all over the place, because most ranges don't need
it. Even with Phobos, we have had many bugs over the years due to stuff like
forgetting to call save or reusing a range, because you have to test with
way too many range types to catch all of the edge cases. IMHO, we need to be
trying to reduce the number of edge cases, not increase them.
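
The kind of save bug described above can be sketched as follows (Counter and
countElements are hypothetical names invented for this example). The generic
function passes every test that uses arrays, and only a reference-type range
shows that it consumed the caller's range:

```d
import std.range.primitives;

// Hypothetical reference-type forward range.
final class Counter
{
    private int cur, end;

    this(int cur, int end) { this.cur = cur; this.end = end; }

    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
    @property Counter save() { return new Counter(cur, end); }
}

static assert(isForwardRange!Counter);

// Buggy generic code: it iterates the parameter directly instead of r.save.
size_t countElements(R)(R r)
if (isForwardRange!R)
{
    size_t n = 0;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

void main()
{
    // With an array, passing by value acts like save, so the bug is hidden.
    auto arr = [1, 2, 3];
    assert(countElements(arr) == 3);
    assert(arr.length == 3);

    // With a class, the copy aliases the original, which ends up consumed.
    auto c = new Counter(0, 3);
    assert(countElements(c) == 3);
    assert(c.empty);
}
```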

And it's not like non-copyable types have ever worked properly with ranges -
so not supporting them is not removing capabilities. It's just not adding
capabilities that a select few would like to have. And while I feel for
them, I really think that it's far too niche to support given how costly it
is to support.

- Jonathan M Davis