foreach vs. iterators

Fri Nov 3 08:45:09 PST 2006

Chris Nicholson-Sauls wrote:
> Bill Baxter wrote:
> 
>> Sean Kelly wrote:
>>
>>> Walter Bright wrote:
>>>
>>>> Sean Kelly wrote:
>>>>
>>>>> I think I agree, but for the sake of argument, how would D handle 
>>>>> 'find'  operations on sequences that don't support opIndex?  At 
>>>>> some point, a generalizable bookmark is almost necessary for a 
>>>>> faithful implementation of some C++ algorithms.  Also, many of 
>>>>> these algorithms also require iteration across two sequences 
>>>>> simultaneously, which doesn't map very cleanly into foreach.  
>>>>> Consider a substring-style find operation (std::search), for 
>>>>> example, where both the pattern and target types do not support 
>>>>> opIndex or opSlice.  I might argue that it's a bit silly or 
>>>>> unrealistic to desire such an operation, but the C++ algorithm 
>>>>> library does support it.
>>>>
>>>>
>>>>
>>>>
>>>> I think the target types will have to support opIndex or opSlice.
>>>
>>>
>>>
>>>
>>> Forgive me for resurrecting a horribly battered equine, but I was 
>>> thinking about this a bit tonight and I've decided that foreach, 
>>> foreach_reverse, and array operations are not a sufficient general 
>>> substitute for C++ iterators.
>>>
>>> For example, let's say I want to compute the set union of two sorted 
>>> ranges, one of size M, the other of size N.  By iterating across the 
>>> two ranges simultaneously is is easy to see that the complexity for 
>>> the operation will be max(M,N).  In C++ this is easily accomplished 
>>> using forward iterators, and the ranges can be anything from arrays 
>>> to SQL result sets.
>>>
>>> In D however, there is no way to iterate across multiple ranges 
>>> simultaneously using foreach, so opIndex must be used instead.  With 
>>> containers that have a constant-time lookup cost this isn't a big 
>>> deal, but what if these containers are binary trees?  The complexity 
>>> for such an operation would be M(log(M)) + N(log(N)).
>>>
>>>  From this it seems clear that foreach is not sufficiently adaptable 
>>> to meet the needs of all sequential algorithms and opIndex is not a 
>>> reasonable substitute for all cases where foreach may not be used.  
>>> What alternatives do we have?  So far, iterator variants are all I've 
>>> been able to come up with.
>>
>>
>>
>> It would be sufficient if D had coroutines and opApply were 
>> implemented using them.
>>
>> The problem is that once you call an opApply function right now, it 
>> has to run to completion.  The only way I can see to have multiple 
>> opApply style iterators going simultaneously if you could truly 
>> suspend execution of one opApply temporarily and run another one for 
>> an iteration.
>>
>> In other words, if you're going to use the stack and function state to 
>> store iterator state, then the only way to have multiple iterators in 
>> different states simultaneously is to be able to have multiple stacks 
>> active simultaneously, aka coroutines.
>>
>> --bb
> 
> 
> Just a thought I had.  Would it be possible to emulate this behavior 
> with delegate-as-aggregate?  What I'm thinking of (and bear in mind I 
> just rolled out of bed, so I'm at quarter-speed right now ;)) is pass 
> foreach a "driver" delegate that then calls step iterator delegates 
> you've provided.  The driver and step iterators would be responsible for 
> maintaining state, and would probably need a reset case (at least the 
> iterators).
> 
> -- Chris Nicholson-Sauls

I was tinkering around with that.  But the problem is that any time you 
call an opApply you end up inside a function that basically does 
something like:

      for (i=0; i<length; i++) {
         ret = loop_body_dg(my_ith_item[i]);
      }

There's no way I can see to stop that juggernaut before it's done, short 
of having it return, and if it returns then it'll lose its state. 
Conceptually what you need is a 'yield' that keeps the current stack 
frame alive, but returns to the caller.  Then you could implement 
something like a zip() who's opApply calls the opApply's of each of its 
arguments, which in turn just run one iteration and each yield one value 
back to the zip().  Zip then packages up the results from one iteration 
of each of its args into a tuple and yields that back to the foreach block.

--bb