Iterator straw-man

Tue Nov 7 18:25:34 PST 2006

Sean Kelly wrote:

I mostly agree with you.  But have some comments on the specifics.

> What follows is a rough description of a "smart" read-only iterator for D:
> 
> interface Iterator<T>
> {
>     bool atEnd();
>     T next();
>     T curr();
>     int opApply( int delegate( T ) dg );
> }
> 
> The nuances are unimportant, but the basic idea is that any such 
> iterator should provide a means of advancing, testing whether the 
> iterator references a valid element, and of referencing that element. 
> Also, opApply is supported to allow iterators to be used in foreach 
> operations for simple uses.

I don't think opApply should be the mechanism by which foreach accepts 
iterators.  It's just too convoluted and too much a burden on iterator 
implementors.  And it's not going to be as efficient as a raw for-loop, 
either.

> Assuming such a design, the iterator type determines progress through 
> the container (forward, reverse, stepped, etc) rather than the algorithm 
> being applied.  Here is a straightforward implementation of some 
> algorithms using such iterators, 'C' is a container type for demonstration:
> 
>     bool contains( C cont, E match ) {
>         foreach( e; cont ) {
>             if( e == match )
>                 return true;
>         }
>         return false;
>     }

It should be possible to make a template out of that, too.

      bool contains(C,E)( C cont, E match ) {
          foreach( e; cont ) {
              if( e == match )
                  return true;
          }
          return false;
      }

Current compiler limitations aside, that *should* be reducible to the 
same efficiency as a for loop with simple pointer (or index) manipulation.

> [...] 
> 
> Given the above, it seems awkward that the iterator must be declared 
> outside of foreach so a copy may be made to return the position of the 
> found element.  Therefore, foreach should allow an arbitrary key type to 
> be returned by opApply.  

I like that.  It makes sense.  The iterator itself is analogous to the 
int index when foreach-ing arrays.  That seems very logical.

Given that similarity it seems like it would make sense for indexing a 
container by an iterator to work as well:

foreach( i, e; cont.fwdIt ) {
       if( e == match ) {
           writefln("Matched item ", cont[i]);
           return i;
       }
}

I.e. cont[i] becomes synonymous with i.curr().

That would be easy to arrange if opIndex was allowed to take an iterator 
as the index.  (I think it's a good idea to remove the restriction on 
opIndex arguments in any event, but this is another justification.)

An opIndex for an iterator would take the simple form:

    static T opIndex(ContIterator i) { return i.curr(); }

If you had write-able iterator with say a .set(T v) method, then you 
could also have opIndexAssign:

    static T opIndexAssign(ContIterator i, T x) { return i.set(x); }

If the Iterator type has a pointer to its container, you could make the 
methods non-static and assert(this==i.container).

> As an aside, foreach_reverse may remain useful in instances where we 
> want the index of the current element instead of an iterator.  Or array 
> iterators may provide a means to obtain the relevant index.  Perhaps 
> someone could make an argument for or against based on these ideas.

foreach_reverse(i,x; cont) would remain (marginally) useful as a 
shortcut for the special reverse iterator name:  I.e. a synonym for
     foreach(i,x; cont.revIter)

I'm guessing the specially-ordained iterator method names will probably 
end up being opIter and opIterReverse (or maybe same with 'Iterator' 
spelled out) just following the pattern established so far.

--bb