RFC on range design for D2

Tue Sep 9 15:47:18 PDT 2008

On 2008-09-09 18:09:28 +0200, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> Fawzi Mohamed wrote:
>> It is a nice idea to redesign the iterator and range.
>> I think that the proposal is not bad, but I have some notes about it, 
>> and some things that I would have done differently.
>> 
>> 1) The simplest interface input (range) is just
>> bool isEmpty();
>> T next();
>> iterator(T) release();
> 
> Actually next is getNext, and release returns R (the range type).
> 
>> Thefirst two I fully agree on, the second one I suppose is there to 
>> allow resources to be released and possibly transfer the data to 
>> another iterator.. is it really needed?
> 
> Yes. Consider findAdjacent that finds two equal adjacent elements in a 
> collection:
> 
> Range findAdjacent(alias pred = "a == b", Range)(Range r)
> {
>      if (r.isEmpty) return r;
>      auto ahead = r;
>      ahead.next;
>      for (; !ahead.isEmpty; r.next, ahead.next)
>          if (binaryFun!(pred)(r.first, ahead.first)) return r;
>      }
>      return ahead;
> }
> 
> The whole implementation fundamentally rests on the notion that you can 
> copy a range into another, and that you can iterate the collection 
> independently with two distinct ranges. If that's not true, 
> findAdjacent will execute yielding nonsensical results.
> 
> Input iterators are not copyable. With an input iterator "auto ahead = 
> r;" will not compile. But they are movable. So you can relinquish 
> control from one iterator to the other.

ok I understand, indeed it is useful to have non copyable "unique" 
iterators, even if they are not the common iterators (actually I think 
it is potentially even more important for output iterators).

>> Now I would see this simplest thing (let me call it iterator) as the 
>> basic objects for foreach looping.
>> *all* things on which foreach loops should be iterators.
>> If an object is not a iterator there should be a standard way to 
>> convert it to one (.iterator for example).
>> So if the compiler gets something that is not a iterator it tries to 
>> see if .iterator is implemented and if it is it calls it and iterates 
>> on it.
>> This let many objects have a "default" iterator. Obviously an object 
>> could have other methods that return an iterator.
> 
> Fine. So instead of saying:
> 
> foreach (e; c.all) { ... }
> 
> you can say
> 
> foreach (e; c) { ... }
> 
> I think that's some dubious savings.

I think it is useful, but not absolutely necessary.

>> 2) All the methods with intersection of iterator in my opinion are 
>> difficult to memorize, and rarely used, I would scrap them.
>> Instead I would add the comparison operation 
>> .atSamePlace(iterator!(T)y) that would say if two iterators are at the 
>> same place. With it one gets back all the power of pointers, and with a 
>> syntax and use that are understandable.
> 
> But that comparison operation is not enough to implement anything of 
> substance. Try your hand at a few classic algorithms and you'll see.

are you sure? then a range is *exactly* equivalent to a STL iterator, 
only that it cannot go out of bounds:
// left1-left2:
while((!i1.isEmpty) && (!i1.atSamePlace(i2))){
  i1.next;
}
// left2-left1:
while((!i2.isEmpty) && (!i1.atSamePlace(i2))){
  i1.next;
}
// union 1-2
while((!i1.isEmpty) && (!(i1.atSamePlace(i2))){
  i1.next;
}
while(!i2.isEmpty){
  i2.next;
}
// union 2-1
...
// lower triangle
i1=c.all;
while(!i1.isEmpty){
  i2=c.all;
  while(!i2.isEmpty && !i2.atSamePlace(i1)){
    i2.next;
  }
well these are the operations that you can do on basically all 
iterators (and with wich you can define new iterators).
The one you propose need an underlying total order that can be 
efficiently checked, for example iterators on trees do not have 
necessarily this property, and then getting your kind of intersection 
can be difficult (and not faster than the operation using atSamePlace.

>> I understand the idea of covering all possibilities, if one wants it 
>> with .atSamePlace a template can easily construct all possible 
>> intersection iterators. Clearly calling recursively such a template is 
>> inefficient, but I would say the then one should use directly a pair of 
>> iterators (in the worst case one could make a specialization that 
>> implements it more efficiently for the types that support it).
>> 
>> 3) copying: I would let the user freely copy and duplicate iterators if needed.
> 
> I like freedom too. But that kind of freedom is incorrect for input iterators.

Now I realized it, thanks.

>> 4) input-output
>> I think that the operations proposed are sound, I like them
> 
> Then you got to accept the consequences :o).

yes :)

>> 5) hierarchy of iterators
>> I would classify the iterator also along another axis: size
>> infinite (stream) - finite (but unknown size) - bounded (finite and known size)
> 
> Distinguishing such things can be of advantage sometimes, and could be 
> added as a refinement to the five categories if shown useful.

well if an iterator knows its size, and you want to use it to 
initialize an array for example...

>> The other classification:
>> forward iterator (what I called iterator until now)

>> bidirectional range: I understand this, these are basically two 
>> iterators one from the beginning and the other from the end that are 
>> coupled together. I find it a little bit strange, I would just expect 
>> to have a pair of iterators... but I see that it might be useful
>> bidirectional iterator: this is a doubly linked list, I think that this 
>> class of iterators cannot easily be described just as a range, it often 
>> needs three points (start,end,actual_pos), I think has its place (and 
>> is not explicitly present in your list)
>> random_iterator: (this could be also called array type or linear indexed type).
> 
> I can't understand much of the above, sorry.

the only new thing is bidirectional iterator: an iterator that can go 
in both directions as extra iterator type (your bidirectional range is 
something different).
I think it is useful and I don't see the need to shoehorn it into a 
range. For me an iterator is an object that can generate a sequence by 
itself, so a range is an example of iterator, but I don't see the need 
to make each iterator a range.
As I said before a range also has a total ordering of the objects that 
can be easily checked, this is a special king of iterator for me, not 
the most general. Take two ranges of two linked lists, you cannot 
easily build your intersections because you don't know their relative 
order, and checking it is inefficient.

> 
>> So this is what "my" iterator/range would look like :)
> 
> I encourage you to realize your design. Before long you'll find 
> probably even more issues with it than I mentioned above, but you'll be 
> gained in being better equipped to find proper solutions.

I hope we converge toward a good solution ;)

Fawzi
> 
> 
> Andrei