RFC on range design for D2

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Sep 9 09:09:28 PDT 2008


Fawzi Mohamed wrote:
> It is a nice idea to redesign the iterator and range.
> I think that the proposal is not bad, but I have some notes about it, 
> and some things that I would have done differently.
> 
> 1) The simplest interface input (range) is just
> bool isEmpty();
> T next();
> iterator(T) release();

Actually next is getNext, and release returns R (the range type).

> Thefirst two I fully agree on, the second one I suppose is there to 
> allow resources to be released and possibly transfer the data to another 
> iterator.. is it really needed?

Yes. Consider findAdjacent that finds two equal adjacent elements in a 
collection:

Range findAdjacent(alias pred = "a == b", Range)(Range r)
{
     if (r.isEmpty) return r;
     auto ahead = r;
     ahead.next;
     for (; !ahead.isEmpty; r.next, ahead.next)
         if (binaryFun!(pred)(r.first, ahead.first)) return r;
     }
     return ahead;
}

The whole implementation fundamentally rests on the notion that you can 
copy a range into another, and that you can iterate the collection 
independently with two distinct ranges. If that's not true, findAdjacent 
will execute yielding nonsensical results.

Input iterators are not copyable. With an input iterator "auto ahead = 
r;" will not compile. But they are movable. So you can relinquish 
control from one iterator to the other.

> Now I would see this simplest thing (let me call it iterator) as the 
> basic objects for foreach looping.
> *all* things on which foreach loops should be iterators.
> If an object is not a iterator there should be a standard way to convert 
> it to one (.iterator for example).
> So if the compiler gets something that is not a iterator it tries to see 
> if .iterator is implemented and if it is it calls it and iterates on it.
> This let many objects have a "default" iterator. Obviously an object 
> could have other methods that return an iterator.

Fine. So instead of saying:

foreach (e; c.all) { ... }

you can say

foreach (e; c) { ... }

I think that's some dubious savings.

> 2) All the methods with intersection of iterator in my opinion are 
> difficult to memorize, and rarely used, I would scrap them.
> Instead I would add the comparison operation .atSamePlace(iterator!(T)y) 
> that would say if two iterators are at the same place. With it one gets 
> back all the power of pointers, and with a syntax and use that are 
> understandable.

But that comparison operation is not enough to implement anything of 
substance. Try your hand at a few classic algorithms and you'll see.

> I understand the idea of covering all possibilities, if one wants it 
> with .atSamePlace a template can easily construct all possible 
> intersection iterators. Clearly calling recursively such a template is 
> inefficient, but I would say the then one should use directly a pair of 
> iterators (in the worst case one could make a specialization that 
> implements it more efficiently for the types that support it).
> 
> 3) copying: I would let the user freely copy and duplicate iterators if 
> needed.

I like freedom too. But that kind of freedom is incorrect for input 
iterators.

> 4) input-output
> I think that the operations proposed are sound, I like them

Then you got to accept the consequences :o).

> 5) hierarchy of iterators
> I would classify the iterator also along another axis: size
> infinite (stream) - finite (but unknown size) - bounded (finite and 
> known size)

Distinguishing such things can be of advantage sometimes, and could be 
added as a refinement to the five categories if shown useful.

> The other classification:
> forward iterator (what I called iterator until now)
> bidirectional range: I understand this, these are basically two 
> iterators one from the beginning and the other from the end that are 
> coupled together. I find it a little bit strange, I would just expect to 
> have a pair of iterators... but I see that it might be useful
> bidirectional iterator: this is a doubly linked list, I think that this 
> class of iterators cannot easily be described just as a range, it often 
> needs three points (start,end,actual_pos), I think has its place (and is 
> not explicitly present in your list)
> random_iterator: (this could be also called array type or linear indexed 
> type).

I can't understand much of the above, sorry.

> So this is what "my" iterator/range would look like :)

I encourage you to realize your design. Before long you'll find probably 
even more issues with it than I mentioned above, but you'll be gained in 
being better equipped to find proper solutions.


Andrei


More information about the Digitalmars-d-announce mailing list