Iterators for D

Mon Nov 6 15:20:37 PST 2006

Bill Baxter wrote:
> Walter Bright wrote:
>> It's becoming increasingly obvious that D needs iterators. While 
>> opApply  is far better for some kinds of iteration (such as 
>> recursively traversing a directory), iterators are more efficient in 
>> some cases, and allow for things that opApply makes difficult.
>>
>> So hear are the design goals:
>>
>> 1) Works with dynamic and stat arrays
> 
> How about pointers (unknown length arrays)?
>   int *some_numbers;
> 
> Is it required that a pointer be a valid iterator?
> I.e. should it be possible to have this template function:
> 
>    print_all(some_numbers, some_numbers+theLength);
> 
> Or will that case require a wrapper object that contains theLength?
> 
>    auto iter = PointerIterator(some_numbers, theLength);
>    print_all(iter);

It's simpler than that. Pointers can be converted to arrays:
	auto array = some_numbers[0 .. theLength];
and then you can get the iterator.

>> 2) Doesn't need to work with associative arrays
> Why not?  Do you mean it doesn't need to because you can just iterate 
> over keys or values?

Because that falls into my big complaint with iterators - there's no way 
to efficiently 'linearize' a binary tree. Might as well just iterate 
over the keys or values.

> That's like C++'s way.  Iterator is basically a generalized pointer.

Yes.

> The other proposal is more like Java/Python/C#'s way, where the iterator 
>  is like a pointer that knows it's own limits.
> The Java/Python/C# way would definitely work in D.   I'm not sure about 
> the C++ way working for D because, in practice, references and 
> dereferencing get used pretty heavily there.  On the other hand 
> Java/Python/C# were designed without real templates.

I think the C++ like way will be a lot more efficient, and I think it 
will work. Reference types are not needed.

> One of the major design goals for C++ was that for a basic array a 
> regular pointer should be a valid iterator.  Thus the iterator cannot 
> know where it's own end() is, because a pointer does not have such 
> knowledge.  But aside from cute demos on SGI's STL web page where 
> algorithms are applied to plain arrays using pointers,  I've never seen 
> that sort of thing used in the real world.  You almost always use a 
> std::vector or std::list or something besides a raw array.  So pointer 
> equivalence seems to be pretty useless to me.  Even moreso with D.  In 
> the rare case you need to work with raw pointers, you can always make an 
> iterator wrapper.

I think it'll work fine, since pointers can be easily converted to 
dynamic arrays (see above).

> On the plus side for C++ style:
> * iterators are lightweight (just one size_t/pointer in most cases)
> * algorithms and methods all take begin/end pairs making things uniform.
> Minuses
> * Iterators are dumb and error prone and easily go out of bounds
> * algorithms all require begin/end pairs making things cumbersome
> 
> I think the minuses outweigh the plusses.
> But maybe it's useful to take one concrete example:  A generic mergesort 
> or quicksort algorithm.
> 
> In C++ recursive calls just pass the limits of the range to the children
> mergesort(begin, end)
> {
>     ...
>     mergesort(begin, (end-begin)/2);
>     mergesort((end-begin)/2, end);
> }
> It's very efficient.
> 
> In the iterator-is-object approach I'm not sure how that works.
> If all you've got is .next() or .hasNext() I'm not sure what you're 
> supposed to do to create an iterator over half the range represented by 
> the original iterator.

Iterators can be any type, not just objects, that support [], ++, +=, and =.

>> foreach loops will not be able to have a 'key' parameter.
> Don't understand that one.

What's the key for an unknown iterator type?