RFC on range design for D2

Steven Schveighoffer schveiguy at yahoo.com
Wed Sep 10 14:17:33 PDT 2008


"Andrei Alexandrescu" wrote
> Steven Schveighoffer wrote:
>> "Andrei Alexandrescu" wrote
>>> Notice that "a range can't grow" is different from "a range can't be 
>>> assigned from a larger range". In particular, a range operation can 
>>> return a range larger than both input ranges. But not larger than their 
>>> union :o).
>>
>> Yes, so every time you add an element you have to update your forward 
>> range from the 'all' range so it includes the new element at the end. 
>> Every time you remove an element, you have to update your reverse range 
>> from the 'all' range so it excludes the element at the beginning. 
>> Failure to do this results in invalid ranges, which seems to me like a 
>> lot more work than simply not doing anything  (in the case of an 
>> iterator).  The pitfalls of using ranges for dynamically changing 
>> containers might outweigh the advantages that they provide in certain 
>> cases.
>
> No, this is incorrect. I don't "have to" at all. I could define the
> behavior of range as you mention, or I could render them undefined.
> Iterators invalidate anyway at the drop of a hat, so they're none the
> wiser. You can't transform a lack of an advantage into a disadvantage.

A range or iterator that becomes undefined when adding an element to a 
linked list or removing an element from a linked list (provided you don't 
remove the element in question) makes it useless for this type of purpose. 
What I want is a cursor that saves the position of an element, not the end 
and beginning.

Here is what I'm assuming a range consists of, and granted this is an 
assumption since I didn't look at any of your implementation, and a list 
object which uses ranges doesn't even exist AFAIK.  Assume that integers 
below are individual elements

all: 0 1 2 3 4 5 6 7 8 9 E
reverse range: 0 1 2 3 4
forward range: 5 6 7 8 9 E

Now I remove an element from the front:

all: 1 2 3 4 5 6 7 8 9 E
reverse range: ? 1 2 3 4
forward range: 5 6 7 8 9 E

I've now lost my reverse iterator because it's no longer valid, but I can 
reconstruct it by diffing the forward iterator and list.all.

If I add to the end I got a similar situation, I can reconstruct my forward 
iterator by diffing list.all and the reverse iterator.

Yes, it can be done, but it seems like more work than it's worth for this 
case.  The problem is, not only do I have to pay attention to what the end 
and beginning of the list are (as I would with an iterator), but I also have 
to pay attention to the same pieces in the ranges.  So ranges (in this 
implementation) have given me more work to do, and their still not safe 
because I could mistakenly use an invalid range.

> "Look at this pineapple. It's fresher than the other, and bigger too."
>
> "No, it's about as big. That pineapple sucks."

???

>
>>>> My belief is that ranges should be the form of input to algorithms, but 
>>>> iterators should be provided for using containers as general data 
>>>> structures.  Similar to how strings are represented by arrays/slices, 
>>>> but iterators (pointers) exist if you need them.
>>> If we agree it's better without iterators if not needed, we'd need a 
>>> strong case to add them. Right now I have a strong case against them.
>>
>> I don't need to worry about whether you have them or not, I can always 
>> implement them on my own ;)  Really, range/iterator support doesn't 
>> require direct support from the compiler (except for builtin arrays), and 
>> any improvements made to the compiler to support ranges (such as 
>> reference returns, etc) can be applied to iterators as well.
>>
>> I think ranges are an excellent representation when a range of elements 
>> is needed.  I think a cursor or iterator is an excellent representation 
>> when an individual position is needed.
>>
>>>> I'll probably move forward with this model in dcollections, I really 
>>>> like the range idea, and in general the view on how ranges are akin to 
>>>> slices. But I also like having access to iterators for other functions.
>>> Which functions?
>>
>> Functions which take or return a single position.  Such as 'erase the 
>> element at this position' or 'find the position of element x'.
>
> I agree. In fact I agreed in my original document, which I quote:
> ``Coding with ranges also has disadvatages. Some algorithms work
> naturally with individual iterators in the "middle" of a range. A
> range-based implementation would have to maintain a redundant range
> spanning e.g. from the beginning of the container to that middle.''
>
> However, I could meanigfully rewrite std.algorithm to work on ranges
> alone. The disadvantage does exist but is minor, For example, find does
> not return an iterator. It simply shrinks its input range until the
> element is found, or until it is empty. That way you can nicely use the
> result of find iteratively.

I totally agree with you that ranges are the way to go for std.algorithm.  I 
am not debating that.

But you have no example of how iterators and ranges compare for using 
non-array containers in situations BESIDES running std.algorithm.  I'm 
showing you an example, which happens to model after code I actually wrote 
and use, where iterators seem to be more suited for the task.

> Range find(alias pred = "a == b", Range, E)(Range haystack, E needle)
> {
>     alias binaryFun!(pred) test;
>     for (; !haystack.isEmpty; haystack.next)
>     {
>         if (test(haystack.first, needle)) break;
>     }
>     return haystack;
> }
>
> This is quite a few bites smaller than the previous version, which is
> now to be found in std.algorithm:
>
> Iterator!(Range) find(alias pred = "a == b", Range, E)(Range haystack, E
> needle)
> {
>     alias binaryFun!(pred) test;
>     for (auto i = begin(haystack); i != end(haystack); ++i)
>     {
>         if (test(*i, needle)) return i;
>     }
>     return end(haystack);
> }
>
> It has two less variables, and VERY importantly, one less type to deal
> with. Arguments aired against primitive ranges systematically omit this
> important simplification they bring. When you don't weigh in the
> advantages, of course all there is to be seen are the disadvantages.
>
> Better yet, when find does return, client code's in better shape because
> it doesn't need to compare the result against end(myrange). It can just
> test whether it's empty and be done with.

Unless myrange has changed since you called find.  In which case you have to 
run find again to get the range?

> So a newcomer to D2 would have to have an understanding of containers
> and ranges. Containers own data. They offer various traversals to crawl
> them in the form of ranges. Ranges are generalized slices.
>
> If iterators are thrown into the mix, things get more complex because
> iterators are a lower-level primitive, a generalized pointer. So the
> newcomer would have to ALSO understand iterators and deal with functions
> that require or return either. They'd have to learn how to pair
> iterators from ranges and how to extract iterators from ranges (more
> primitives). They'd also have to understand when it's better to hold on
> to a range (most of the time) versus a naked iterator (seldom and for a
> dubious convenience).
>
> I /understand/ there are advantages to iterators. Just don't forget the
> cost when discussing them.

I don't forget the cost.  I absolutely *100%* agree that ranges are a much 
better representation for std.algorithm.  i.e. when a RANGE OF VALUES is 
required.  When you want references SINGLE ELEMENTS that persist across 
container changes, I think the best implementation is a cursor/iterator.  I 
think they can both exist.  I think there is value to having a pointer to a 
single element without storing the boundaries with that pointer.


This is just like the const debate that I continue to have with you and 
Walter.  You want const for different reasons than for what I want const.  I 
want const for contracts, and you want it for pure functions.  You seem to 
dismiss anything that isn't in your realm of requirements as 'dubious' and 
'seldom used'.  Other people have requirements that are different than 
yours, and are just as valid.

> I am also sure that if I sat down long enough contemplating my navel I
> could come with more examples of iterators=good/ranges=bad.
> <snip>

Now you're just being rude :)  Please note that I'm not attacking you 
personally.  All I'm pointing out is that your solution solves certain 
problems VERY well, but leaves other problems not solved.  I think allowing 
iterators/cursors would solve all the problems.  I might be proven wrong, 
but certainly I don't think you've done that so far.  I'd love to be proven 
wrong, since I agree that iterators are generally unsafe.

-Steve 




More information about the Digitalmars-d-announce mailing list