Iterators for D

Oskar Linde oskar.lindeREM at OVEgmail.com
Tue Nov 7 11:15:52 PST 2006


Walter Bright wrote:
> It's becoming increasingly obvious that D needs iterators. While opApply 
>  is far better for some kinds of iteration (such as recursively 
> traversing a directory), iterators are more efficient in some cases, and 
> allow for things that opApply makes difficult.
> 
> So hear are the design goals:
> 
> 1) Works with dynamic and stat arrays
> 2) Doesn't need to work with associative arrays
> 3) Should be possible to achieve "pointer efficiency" with it
> 4) Needs to be able to restrict lvalue access (what C++ does with const 
> iterators)
> 5) Needs to work seamlessly with foreach
> 6) Iterators need to be copyable

I apologize for this long post. Writing too much has become a bad habit 
of mine lately it seems. This took a while writing, and in the meantime 
new posts appeared that say much of the same. If you don't bother 
reading this, read Sean's posts instead, as I agree 100 % with him on 
this. :)

What I suggest is that we should not try to copy the C++ "iterator == 
pointer" idiom, but instead aim for an iterator design that is 
self-contained and simple.

For me, the iterator concept is simple. An iterator is an entity that 
allows an iterative way of accessing data. An iterator needs:
1. a way to reference the data
2. a way to traverse the data
3. a way to signal the end of the data

(In C++, many iterators lack #3 above, which leads to the necessity of a 
second iterator that just signals the end of the iterative range.)

An iterator could also be a generator.
Here are some examples (translated to D) of how iterators are 
implemented in some languages other than C++:

Java
----

class Iterator(T) {
	bool hasNext(); // self explanatory
	T* next();      // steps and returns reference to next element
}

while(i.hasNext())
	writefln(*i.next());


C# (Simplified, from David Gileadi)
-----------------------

class Iterator(T) {
	bool MoveNext();  // false if no more elements
	T Current();      // property getter
}

while(i.MoveNext)
	writefln(i.Current);


Python
------

class Iterator(T) {
	T next();
}

while(true)
	try
		writefln(i.next);
	catch (StopIteration)
		break;


Taking inspiration from the above and following Walter's design goals 
above, here is my suggestion:

interface Iterator(T) {
	bool step();     // as C# MoveNext
	T value;         // getter (rvalue access)
	T value(T);      // setter (lvalue access)
	int opApply(...) // support foreach style iteration
}

An alternative is supplying a value property that returns a pointer to 
the element, but that would not allow limiting lvalue access for read 
only iterators and introduces a pointer, which might be bad by itself.

Usage:

// explicit iteration
while(i.step) {
	writefln(i.value);
	i.value = 5;
}

// implicit iteration
foreach(o; i)
	writefln(o);


I will briefly explain my reasoning:

As I said, I believe the iterator should be clean and simple. Saying 
that it is OK for the iterator implementations to be dirty and messy, 
but that the implementations will be hidden away in libraries and only 
be the headache of library writers will lead to just that - only library 
writers will write iterators.

As Sean hints at, "random access iterators" are not pure iterators. The 
random access part is orthogonal to the iterative part. In D, a perfect 
way to implement a random access view is defining a .length property and 
an opIndex method. A random access view doesn't need to be iterable and 
an iterator doesn't need to provide random access. A D slice is the 
perfect example of a random access view.

I see no reason to use operator overloading when the result isn't 
replaceable with a pointer anyway.

Bill Baxter's points regarding iterator as a concept vs an 
interface/abstract class are very valid. And as Bill says, maybe both 
should be supported. One of them could always be converted to the other.

Regarding efficiency, here is one sample implementation that a compiler 
should(*) be able to make just as efficient as for/foreach loops:

struct Iterator(T) {
     T* ptr;
     T* end;

     bool step() { return ++ptr != end; }
     T value() { return *ptr; }
     T value(T x) { return *ptr = x; }
     // mixin OpApplyMixin;
}

Iterator!(T) iter(T)(T[] x) {
     Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE
     ret.ptr = x.ptr-1;
     ret.end = x.ptr+x.length;
     return ret;
}

void main() {
     int[] arr = [1,2,3,4,5];

     auto i = arr.iter();

     while(i.step) {
         writefln("%s",i.value);
         i.value = 5;
     }
}

/Oskar



More information about the Digitalmars-d mailing list