Range Redesign: Copy Semantics

Mon Jan 22 05:22:44 UTC 2024

On Monday, 22 January 2024 at 03:52:02 UTC, Jonathan M Davis 
wrote:
> On Sunday, January 21, 2024 6:50:26 AM MST Alexandru Ermicioi 
> via Digitalmars- d wrote:
>> Then new input range api can also be propagated to other types 
>> of range such as forward range and further.
>
> Part of the point here is to _not_ do that, because they 
> fundamentally have different semantics.

I think maybe we're losing sight of the big picture in this 
discussion.

The big-picture, overarching goal of the range API is to provide 
a common interface between data structures and algorithms. You 
have N data structures and M algorithms, and by using this common 
interface you can implement all N*M combinations in O(N+M) code. 
Each data structure implements the range API once, and each 
algorithm is implemented once using the range API.

If you split the API in two, by making input streams (basic input 
ranges) and forward ranges completely disjoint, you undermine 
this goal. Now, each data structure has to implement *two* APIs, 
and each algorithm has to be implemented *twice*, once for input 
streams and once for forward ranges.

In practice, what's going to happen is library authors will 
simply not bother to implement both, and we will end up with gaps 
where many (data structure, algorithm) pairs are not covered.

I appreciate the desire to provide consistent copying semantics, 
but if it comes to a choice between that and having a single 
unified API, I will choose the unified API, warts and all.

> Restricting copying would make ranges borderline unusable. They 
> have to be able to be passed around and be wrappable, which 
> means either copying them or moving them, and moving them would 
> be very un-user-friendly, since it would require explicit calls 
> to move calls all over the place.

Worth noting that generic range algorithms already have to 
account for the existence of non-copyable types, since even if a 
range itself is copyable, its elements may not be.

Also, as I pointed out in an earlier reply to Walter [1], there 
are legitimate use-cases for non-copyable input streams, where 
making them copyable would incur a significant performance 
overhead.

I think the correct solution to this is something like DIP 1040, 
which makes non-copyable types less of a pain to work with in 
general.

[1] 
https://forum.dlang.org/post/sullaynnrjufookbuijt@forum.dlang.org

---

In light of the points above, my proposed copying semantics for 
input streams are:

1. Input streams MAY be non-copyable, but are not required to be.
2. If you copy an input stream and then call next() on the 
original, the behavior of both the original and the copy is 
unspecified.

That is, I think we should give up on having 100% consistent 
copying semantics in this one case, in order to keep the overall 
API unified (by supporting UFCS next()) and avoid unnecessary 
pessimization of range algorithms.

The good news is that with these semantics, if you write you 
generic code conservatively and treat all input streams as 
non-copyable, you'll get the right behavior in all cases. And if 
you don't, you'll get a compile-time error as soon as you 
actually try to pass in a non-copyable input stream, and you'll 
know exactly how to fix it. So this design is still an 
improvement on the status quo.