Iterators recap

Wed Dec 27 16:56:13 PST 2006

Oskar Linde wrote:
> 
> Now to bidirectional iterators.
> 
> Separating R, T, S as three different methods, here are two ways a 
> bidirectional iterator could be implemented:
> 
> The value-style:
> 
> BidirectionalIterator1(T) {
>     T* begin;
>     T* end;
>     T* current;
> 
>     bool hasValue() { return cursor != end && cursor != (begin-1); }
>     void moveNext() { current++; }
>     void movePrevious() { current--; }
> 
>     T value() { return *current; }
>     T value(T v) { return *current = v; }
> }
> 
> Silly usage example:
> 
> auto i = someIterator;
> while(i.hasValue) {
>     if (i.value > x)
>         i.moveNext;
>     else
>         i.movePrevious;
> }
> 
> I like how succinct this turns out, but it has the disadvantage of 
> internally needing two invalid values, one at the start and one at the 
> end.

Not necessarily.  The dummy node could be shared by both.  This is one 
reason why circular linked-list implementations are so common.  Things 
do admittedly get a bit weird for binary trees, however.

 > Considering pointers internally, this means a pointer pointing
> before the start of an array, which is very bad.

In C++, while a bidirectional iterator may be able to traverse in either 
direction, it still generally has some natural association as a forward 
or reverse iterator.  That is, when the iterator is first returned from 
the sequence it was likely either initialized pointing to the first or 
last element.  If it's the former then it's associated with forward 
iteration, if it's the latter then reverse iteration.  For example, the 
C++ vector iterator is bidirectional (actually, random), but iterators 
returned by the begin() method are effectively forward iterators while 
iterators returned by the rbegin() method are effectively reverse 
iterators.  Perhaps this is the correct approach to take here as well?

 > The other disadvantage
> is hasValue that has to check for both of these two cases.

True enough.  But range-checking is range-checking.  If it's possible 
for an iterator to go out of range and it's possible to check for this 
condition then one should do so, at least in debug builds.

> One alternative is the Java style version. Where the pointer/cursor lies 
> conceptually between the elements instead of at an element, giving:
> 
> BidirectionalIterator2(T) {
>     T* begin;
>     T* end;
>     T* cursor;
> 
>     bool hasNext() { return cursor != end; }
>     bool hasPrevious() { return cursor != begin; }
>     void moveNext() { cursor++; }
>     void movePrevious() { cursor--; }
> 
>     T next() { return *cursor; }
>     T next(T v) { return *cursor = v; }
> 
>     T previous() { return *(cursor-1); }
>     T previous(T v) { return *(cursor-1); }
> }
> 
> 
> auto i = someIterator;
> while(i.hasNext) {
>     if (i.next > 3)
>         i.moveNext;
>     else
>         i.movePrevious;
> }
>     
> The disadvantage is requiring two additional methods (hasPrevious and 
> previous), but the advantage is that it doesn't need more than 1 illegal 
> value (pointing 1 past the end, which is well supported).

The other disadvantage is that this returns to the original problem 
where it's impossible to determine whether an iterator points to a valid 
location without moving.

> Apart from different method names, are there other variants?

See above.  The C++ approach seems to make more sense to me, though I'm 
still mulling it over.  There is a definite appeal to having iterators 
that are truly direction-agnostic, but as you've demonstrated above it 
does result in slightly weird implementation requirements.  Since D is 
like C++ in that the one-past-the-end location is valid but 
one-past-the-beginning is not, it seems reasonable to preserve that 
behavior here, even if a more Java-like iterator design is used.

Sean