foreach vs. iterators
Georg Wrede
georg.wrede at nospam.org
Fri Nov 3 16:51:50 PST 2006
Walter Bright wrote:
> Sean Kelly wrote:
>
>> Forgive me for resurrecting a horribly battered equine, but I was
>> thinking about this a bit tonight and I've decided that foreach,
>> foreach_reverse, and array operations are not a sufficient general
>> substitute for C++ iterators.
>>
>> For example, let's say I want to compute the set union of two sorted
>> ranges, one of size M, the other of size N. By iterating across the
>> two ranges simultaneously is is easy to see that the complexity for
>> the operation will be max(M,N). In C++ this is easily accomplished
>> using forward iterators, and the ranges can be anything from arrays to
>> SQL result sets.
>>
>> In D however, there is no way to iterate across multiple ranges
>> simultaneously using foreach, so opIndex must be used instead. With
>> containers that have a constant-time lookup cost this isn't a big
>> deal, but what if these containers are binary trees? The complexity
>> for such an operation would be M(log(M)) + N(log(N)).
>>
>> From this it seems clear that foreach is not sufficiently adaptable
>> to meet the needs of all sequential algorithms and opIndex is not a
>> reasonable substitute for all cases where foreach may not be used.
>> What alternatives do we have? So far, iterator variants are all I've
>> been able to come up with.
>
> You are correct in regards to looping across multiple aggregates
> simultaneously.
>
> On the other hand, C++ iterators aren't a general substitute for
> opApply. For example, it is not at all easy to turn a recursive data
> structure (i.e. binary tree) into a linearized iterator.
If I'm not missing the essence here, there are several kinds of
iterators, and not every data structure should try to provide all of
these. For example, an iterator into a tree structure might be
breadth-first or depth-first, but one wouldn't expect a linearized (if I
understand the term in this context at all) iterator to exist.
Conversely, an array is hardly thinkable without a linear iterator.
What Stepanov explained to me at length (back when I was studying CS and
there was not a single book published on the STL) was that the different
containers all have their specific sets of "natural" iterators, and that
only where it is convenient or at all implementable, other iterators may
exist to them. And that it's not like you have a container and then need
a certain iterator to it, but that you choose the container based on
what kind of iterators you need in the first place.
Then it's about the iterator just being a way to reach every item
exactly once. And if you don't seem to get them in the order you want,
then you sort the items, say into a list. (Or rethink the problem.)
Back to foreach: it's syntactically a very convenient way to go through
a trivial collection (i.e. an array or a list). But for non-trivial
containers (or data structures), foreach simply is insufficient. You
need an iterator. Iterators were invented for the very purpose of
walking through a black-box container, if you will. I'd almost say that
container-specific iterators are the only way to guarantee the lowest
complexity walk-through of an arbitrary black-box container.
That said, it is of course customary to provide "wrong" iterators to
containers (where at all possible, as conveniences), for example a
random access iterator to a linked list. But then the user should stay
aware that using such is only for the cases where the choice of
container has already been evaluated to be optimal, and where one only
occasionally needs to access some of the items in a specific way. In
other words, where this need is seldom enough to not warrant a temporary
copy (or linking) of the items into a more suitable container.
One could of course have foreach actually employ iterators. This gives
the ease of coding with foreach combined with the flexiblity, decoupling
and orthogonality offered by iterators.
But, as in the previous post, even this becomes cumbersome when we're
iterating over more than one container simultaneously, especially when
they're of different sizes or iterating in non-lockstep.
---
In general, the essence and elegance of the STL has not been too
prevalent in D libraries or algorithms. With the current expressive
power of D, one might expect to soon start seeing some of this.
More information about the Digitalmars-d-announce
mailing list