Iterator straw-man

Wed Nov 8 09:00:37 PST 2006

Bill Baxter wrote:
> Sean Kelly wrote:
> 
> I mostly agree with you.  But have some comments on the specifics.
> 
>> What follows is a rough description of a "smart" read-only iterator 
>> for D:
>>
>> interface Iterator<T>
>> {
>>     bool atEnd();
>>     T next();
>>     T curr();
>>     int opApply( int delegate( T ) dg );
>> }
>>
>> The nuances are unimportant, but the basic idea is that any such 
>> iterator should provide a means of advancing, testing whether the 
>> iterator references a valid element, and of referencing that element. 
>> Also, opApply is supported to allow iterators to be used in foreach 
>> operations for simple uses.
> 
> I don't think opApply should be the mechanism by which foreach accepts 
> iterators.  It's just too convoluted and too much a burden on iterator 
> implementors.  And it's not going to be as efficient as a raw for-loop, 
> either.

True enough.  I mostly added it as an alternative for situations where 
users didn't need the power of a manual loop.  And I think 
implementation would be pretty simple for most/all iterators:

     for( ; !atEnd(); next() )
     {
         ret = dg( curr() );
         // handle ret
     }

>> Assuming such a design, the iterator type determines progress through 
>> the container (forward, reverse, stepped, etc) rather than the 
>> algorithm being applied.  Here is a straightforward implementation of 
>> some algorithms using such iterators, 'C' is a container type for 
>> demonstration:
>>
>>     bool contains( C cont, E match ) {
>>         foreach( e; cont ) {
>>             if( e == match )
>>                 return true;
>>         }
>>         return false;
>>     }
> 
> It should be possible to make a template out of that, too.
> 
>      bool contains(C,E)( C cont, E match ) {
>          foreach( e; cont ) {
>              if( e == match )
>                  return true;
>          }
>          return false;
>      }
> 
> Current compiler limitations aside, that *should* be reducible to the 
> same efficiency as a for loop with simple pointer (or index) manipulation.

Yup.  The function above was meant for illustrating an idea more than it 
was a proposal for how to implement a contains function :-)

>> [...]
>> Given the above, it seems awkward that the iterator must be declared 
>> outside of foreach so a copy may be made to return the position of the 
>> found element.  Therefore, foreach should allow an arbitrary key type 
>> to be returned by opApply.  
> 
> I like that.  It makes sense.  The iterator itself is analogous to the 
> int index when foreach-ing arrays.  That seems very logical.
> 
> Given that similarity it seems like it would make sense for indexing a 
> container by an iterator to work as well:
> 
> foreach( i, e; cont.fwdIt ) {
>       if( e == match ) {
>           writefln("Matched item ", cont[i]);
>           return i;
>       }
> }
> 
> I.e. cont[i] becomes synonymous with i.curr().

Hrm, good point.

> That would be easy to arrange if opIndex was allowed to take an iterator 
> as the index.  (I think it's a good idea to remove the restriction on 
> opIndex arguments in any event, but this is another justification.)

Is it currently required to be an integer?  I had no idea.

> An opIndex for an iterator would take the simple form:
> 
>    static T opIndex(ContIterator i) { return i.curr(); }
> 
> If you had write-able iterator with say a .set(T v) method, then you 
> could also have opIndexAssign:
> 
>    static T opIndexAssign(ContIterator i, T x) { return i.set(x); }
> 
> If the Iterator type has a pointer to its container, you could make the 
> methods non-static and assert(this==i.container).

All good ideas.

>> As an aside, foreach_reverse may remain useful in instances where we 
>> want the index of the current element instead of an iterator.  Or 
>> array iterators may provide a means to obtain the relevant index.  
>> Perhaps someone could make an argument for or against based on these 
>> ideas.
> 
> foreach_reverse(i,x; cont) would remain (marginally) useful as a 
> shortcut for the special reverse iterator name:  I.e. a synonym for
>     foreach(i,x; cont.revIter)

True enough.

> I'm guessing the specially-ordained iterator method names will probably 
> end up being opIter and opIterReverse (or maybe same with 'Iterator' 
> spelled out) just following the pattern established so far.

Following the D convention, the 'op' prefix suggests a use form 
different from that of a normal function call.  Personally, I'd be fine 
with this done as plain old property methods: fwdIt/revIt, or something 
along those lines.

Sean