Iterators (Was: Re: Lazy eval -- an example issue)

Fri Aug 25 06:54:20 PDT 2006

Sean Kelly wrote:
> Walter Bright wrote:
>>
>> I've struggled to get people to accept the {} version ever since D 
>> adopted anonymous delegates. Haven't made much headway in getting such 
>> used or in it having any sort of significant impact. How many have 
>> made a "dotimes(n, exp)" function or any sort of syntax extension 
>> using it? None that I've seen.
> 
> I've been working on a predicate-oriented algorithm module on and off 
> for the past few weeks.  A few looping constructs have been left out 
> because foreach seems a preferable solution in D, but the rest is coming 
> along nicely.  

Interesting. I have been doing some of this myself too, but my available 
time has been limited. I have also been looking at the concept of 
iterators. I agree that foreach is the preferable solution, but in some 
cases, only supporting foreach is too limited. For example, only using 
foreach, you can not iterate over two collections simultaneously.

It would be great if we could come up with a standardized Iterator 
concept for D that supported both foreach style iteration (opApply) and 
some kind of single step iteration (maybe Java-style next/hasNext).

Here is a sample of how it could look (but this particular sample seems 
to give a segfault on DMD 0.165 for some reason):

template opApplyMixin() {
   int opApply(int delegate(inout ValueType v) dg) {
     ValueType *t;
     while(hasNext()) {
       t = next();
       if (auto status = dg(*t))
	return status;
     }
     return 0;
   }
}

/// Simple array iterator wrapper
struct ArrayIterator(T) {
   T[] arr;
   alias T ValueType;

   T* next() { T* r = arr.ptr; arr = arr[1..$]; return r; }
   bool hasNext() { return arr.length > 0; }
   mixin opApplyMixin;
}

I have been looking at implementing many of my functional algorithms as 
array views/iterators. For example:

foreach(word; read!(char)("file.txt").splitIterate(&isSpace)) {
	writefln("%s",word);
}

where splitIterate is implemented just as the regular split, but instead 
of allocating and returning char[][] array, it is implemented as an 
iterator returning the next char[] slice for each iteration without 
allocating any extra data.

read could probably be implemented in a smart way, reading the file a 
certain chunk at a time. Never using more than a constant amount of memory.

Other functions as iterators I have been implementing are:

split(array,predicate|substring|element)
select(array,predicate|element)
reverse(array)
range(start,stop,next)
generate(init,next)
map(array,func)
indexmap(array,indexarray|nextindexfunc)
and so on...

> About the only problem I had was inlining delegates 
> passed to the functions, so the default operations are passed as structs 
> with opCall defined instead.

I've encountered this too. Delegates passed to functions are often 
called in tight inner loops. So far, I've never seen DMD doing a good 
job at inlining those cases. Structs with opCall is an excellent idea, 
that I will steal right away. :) It would be interesting to know if we 
in the future can count on const delegates being inlined just as 
efficiently.

/Oskar