Iterators for D

Wed Nov 8 13:26:20 PST 2006

Sean Kelly wrote:
> Daniel Keep wrote:
> 
>>
>> Walter Bright wrote:
>>
>>> Sean Kelly wrote:
>>>
>>>> Since D has slicing, the argument for using iterators to define the
>>>> boundaries of a range of randomly accessible elements seems kind of
>>>> small to me.  ie.
>>>>
>>>>     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
>>>> or
>>>>     sort( a[0 .. 5] );
>>>>
>>>> I find the second to be cleaner to look at.  But I'm undecided whether
>>>> we'd lose any power or flexibility by not bothering with random access
>>>> iterators.
>>>
>>> Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
>>> That definitely sounds like a good area to explore.
>>
>>
>> Hang on; aren't we back to where we are *right now*?
> 
> 
> Essentially.  But D has no formal description of an iterator type.  I 
> think it would be a useful convention to establish, and doing so 
> requires determining what types of iterators should be supported, and how.
> 
>> Things that need random access override opIndex and opIndexAssign.
>>
>> Things which you can take a range of values from override opSlice and
>> opSliceAssign.
>>
>> Things that require lock-step iteration override opApply, or supply a
>> function that returns a delegate that "runs" a foreach loop.
>>
>> About the only thing this *doesn't* cover are bi-directional iterators
>> (say, iterating back and forth over an infinite series).
> 
> 
> This wouldn't be too difficult to do by adding a prev() method and 
> atBegin() or some such.  Or maybe the Java style of hasNext(), next(), 
> hasPrev(), prev().
> 
>> I know some people want to be able to stop iteration and resume it, but
>> couldn't that be solved by making opApply return a generator like what
>> Python does?  Those things are written the same way as a current opApply
>> (sans all the return value from the delegate stuff), and can be advanced
>> manually.
>>
>> I know this can be done.  I wrote the coroutine module for StackThreads
>> a while back, and one of the *very first* things I did was implement
>> generic generators.  The base generator class essentially just
>> implemented opApply switching over to the coroutine every time it needed
>> the next value.  Here's a port of Python's range() function (well, the
>> one and zero argument versions :P):
>>
>>> class
>>> range : Generator!(uint)
>>> {
>>>     // These just let you use range() or range(n)
>>>     static
>>>     range
>>>     opCall()
>>>     {
>>>         return range(uint.max);
>>>     }
>>>
>>>     static
>>>     range
>>>     opCall(uint limit)
>>>     {
>>>         return new range(limit);
>>>     }
>>>
>>> protected:
>>>     uint limit;
>>>
>>>     // Or you can use new range(n)
>>>     this(uint limit)
>>>     {
>>>         this.limit = limit;
>>>         super();
>>>     }
>>>
>>>     // Where the magic happens.
>>>     override
>>>     uint
>>>     run()
>>>     {
>>>         uint i = 0;
>>>         while( i < limit )
>>>             yield(i++);    // yield in D: *very* sexy
>>>
>>>         StopGenerator();   // throws an exception, imitating what
>>>                            // Python does to stop an iterator.
>>>     }
>>> }
>>
>>
>> Now, from that, we can implement opApply AND next/current in the base
>> class.  Going with coroutines kills both the opApply and forward
>> iterator birds with one stone.
> 
> 
> Interesting idea--I'll admit I don't have much experience with Python. 
> But I think both approaches have value.  Using coroutines just to step 
> across two sequences simultaneously seems unnecessarily complex, and not 
> terribly efficient compared to a simple pointer increment.  But it does 
> offer a tremendous amount of power for more complex cases.
> 
>> As for bi-directional iterators... I actually can't think of anywhere
>> I'd use them.  I just keep thinking "I could just use random access to
>> the same end".  Python's never suffered from not having them...
> 
> 
> I don't think I've ever needed them either, but they can come in handy 
> when implementing one container in terms of another.  Say an iterable 
> stack and queue (unlikely, I know) that are implemented via a doubly 
> linked-list.

I wrote some mesh (graph) manipulation algorithms in Python.  I really 
needed a linked list for that and a way to point to e.g. an edge in a 
list of edges coming out of a vertex that wouldn't be invalidated by 
inserting edges here and there before or after it, and I needed both to 
find the previous edge and the following edge.  In C++ std::list would 
have been the clear choice.

Doing that with indexes into to a standard python list just isn't 
practical (I started out trying it that way because Python FAQ's seem to 
say "you don't need a linked list, just use a python list".  But with a 
python list, every time you insert one new item you have to go find 
everyone out there that's holding onto an index into this list and get 
them to update their values.  Just not practical.

So sometimes bidirectional iterators are needed.  Even in Python.

--bb