Iterators for D

Kirk McDonald kirklin.mcdonald at gmail.com
Mon Nov 6 14:06:55 PST 2006


Walter Bright wrote:
> It's becoming increasingly obvious that D needs iterators. While opApply 
>  is far better for some kinds of iteration (such as recursively 
> traversing a directory), iterators are more efficient in some cases, and 
> allow for things that opApply makes difficult.
> 
> So hear are the design goals:
> 
> 1) Works with dynamic and stat arrays
> 2) Doesn't need to work with associative arrays
> 3) Should be possible to achieve "pointer efficiency" with it
> 4) Needs to be able to restrict lvalue access (what C++ does with const 
> iterators)
> 5) Needs to work seamlessly with foreach
> 6) Iterators need to be copyable
> 
> So here's one design:
> 
> .begin property returns an iterator that starts at the beginning
> 
> .end returns an iterator that is at the end
> 
> Overloading * doesn't work with D. But, instead:
> 

What are the arguments against that, again? I do not recall, and it's 
worth at least reviewing why we don't have opDeref and opDerefAssign.

As you know, Walter, a major benefit of C++-style iterators is that they 
are semantically as close as possible to pointers. Would [] be changed 
to dereference pointers? I sure hope not. If we go this route, I would 
suggest adding these unary-* overloads.

> Overload opIndex for rvalue access
> Overload opIndexAssign for lvalue access
> 

I think you mean opSlice and opSliceAssign.

Bar bar;
auto foo = bar.begin();
Bar baz = foo[];
foo[] = Bar.init;

> Overloading opIndex also will work for random access
> 
> foreach loops will not be able to have a 'key' parameter.

So, I would assume these iterators would work something like this:

If a class provides .begin and .end, and these both return the same 
type, and that type provides the requisite iterator methods, it is 
foreach-able.

If a class provides .begin, .end, and opApply, one of the two iteration 
methods has to take precedence. I would hope it's opApply.

For a type T, its associated iterator type should be available via 
T.iterator. This has to be standard.

An iterable type T MUST provide these methods:
I begin()
I end()

An iterator type I MUST provide the following overloads:
E opSlice()
I opPostInc()

E is the type of an element in the collection.

These are enough to describe a read-only forward iterator, which is 
adequate for foreach. In other words, given a type T that meets the 
requirements outlined above, the following is adequate to iterate 
through it:

foreach(e; t) {
     // ...
}

'e' would be of type E.

We run into a curious problem involving pre-increment, which uses 
opAddAssign. Classes for which random-access is problematic should not 
be required to provide this, which is essentially a random-access 
method. There are two options here:

1) Only require that iterators support opPostInc. (This further removes 
them from actual pointers.)

2) Introduce a convention whereby a non-random-access iterator throws an 
exception if any value other than 1 is passed to its opAddAssign overload.

Extending these rules for bidirectional iterators and random-access 
iterators, as well as read-write iterators, is left as an exercise to 
the reader. (And not a very difficult one.)

All of these strange edge-cases and differences between pointers and 
possible iterator types convince me that C++-STYLE ITERATORS ARE NOT FOR 
D. D, unlike C++, does not supply enough operator overloads for their 
semantics to be identical to those of pointers, which was the whole 
point of C++-style iterators in the first place.

Instead, we should look to other languages for inspiration. I suggest 
Java and Python, whose iterator semantics are similar. Other posters 
have already made suggestions on this front. I would only suggest the 
"opIter" method for getting an iterator, and that the iterator class 
itself be available via T.iterator (whether it is a nested class or an 
alias shouldn't matter). Classes can provide methods for returning 
alternate iterators, as well. (A class providing opIter should be 
iterated over as easily as an iterator itself.)

-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org



More information about the Digitalmars-d mailing list