Iterators for D
Kirk McDonald
kirklin.mcdonald at gmail.com
Mon Nov 6 17:20:56 PST 2006
Walter Bright wrote:
> Kirk McDonald wrote:
>> Walter Bright wrote:
>>> Overloading * doesn't work with D. But, instead:
>> What are the arguments against that, again?
>
> Because D has many cases of implicit dereferencing, I thought it would
> cause more trouble than it's worth.
>
>> I do not recall, and it's worth at least reviewing why we don't have
>> opDeref and opDerefAssign.
>>
>> As you know, Walter, a major benefit of C++-style iterators is that
>> they are semantically as close as possible to pointers. Would [] be
>> changed to dereference pointers?
>
> You've always been able to:
> int* p;
> p[1] = ...
>
Wouldn't that be p[0] = ...?
Is that how you intend to use opIndex to "dereference" the iterator?
That's /terrifying/. Although it is consistent with pointers, using [0]
for everything is just asking for trouble with regards to
non-random-access iterators. And though it's probably not totally
meaningless to a random reader of the code, it comes pretty darned close.
>> I sure hope not. If we go this route, I would suggest adding these
>> unary-* overloads.
>>
>>> Overload opIndex for rvalue access
>>> Overload opIndexAssign for lvalue access
>>>
>>
>> I think you mean opSlice and opSliceAssign.
>
> No, I meant Index.
>
>>> Overloading opIndex also will work for random access
>>>
>>> foreach loops will not be able to have a 'key' parameter.
>>
>> So, I would assume these iterators would work something like this:
>>
>> If a class provides .begin and .end, and these both return the same
>> type, and that type provides the requisite iterator methods, it is
>> foreach-able.
>
> Yes, though they will return a value, not a type.
>
s/these both return the same type/their return types are the same/
>> If a class provides .begin, .end, and opApply, one of the two
>> iteration methods has to take precedence. I would hope it's opApply.
>
> I was leaning towards the other way around.
>
It is slightly easier and clearer to explicitly say:
foreach(e; t.begin) { }
than
foreach(e; &t.opApply) { }
opApply is an operator overload. .begin would just be a special method.
I would expect the operator to take precedence.
>> For a type T, its associated iterator type should be available via
>> T.iterator. This has to be standard.
>
> It's not needed, as typeof(T.begin) will do it.
>
Due to the way in which D's type inference from properties works, you'll
need typeof(T.begin()). Otherwise, it will derive the type of the method
itself. (Which isn't even a valid type, as such.)
>> An iterable type T MUST provide these methods:
>> I begin()
>> I end()
>
> Yes.
>
>> An iterator type I MUST provide the following overloads:
>> E opSlice()
>
> No. opIndex()
I'm referring specifically to the no-index form of opSlice (meaning i[]
would "dereference" the iterator). (A no-index opIndex doesn't work.)
This of course is inconsistent with pointers, so saying i[0] is better
on that count.
>
>> I opPostInc()
>
> Probably opAddAssign()
>
Oh, there's another one:
int opEquals(I)
>> E is the type of an element in the collection.
>
>
>> These are enough to describe a read-only forward iterator, which is
>> adequate for foreach. In other words, given a type T that meets the
>> requirements outlined above, the following is adequate to iterate
>> through it:
>>
>> foreach(e; t) {
>> // ...
>> }
>>
>> 'e' would be of type E.
>
> Yes.
>
>> We run into a curious problem involving pre-increment, which uses
>> opAddAssign. Classes for which random-access is problematic should not
>> be required to provide this, which is essentially a random-access
>> method. There are two options here:
>>
>> 1) Only require that iterators support opPostInc. (This further
>> removes them from actual pointers.)
>>
>> 2) Introduce a convention whereby a non-random-access iterator throws
>> an exception if any value other than 1 is passed to its opAddAssign
>> overload.
>
> I could go either way with that.
>
Neither one is particularly elegant.
>> Extending these rules for bidirectional iterators and random-access
>> iterators, as well as read-write iterators, is left as an exercise to
>> the reader. (And not a very difficult one.)
>>
>> All of these strange edge-cases and differences between pointers and
>> possible iterator types convince me that C++-STYLE ITERATORS ARE NOT
>> FOR D. D, unlike C++, does not supply enough operator overloads for
>> their semantics to be identical to those of pointers, which was the
>> whole point of C++-style iterators in the first place.
>
> I think it does provide enough, in fact, it will be less wacky than in
> C++ (look at the wretched iterator/const_iterator dichotomy). It'll work
> with core arrays, and will not need those typedefs.
>
Without the ability to overload the dereference operator, any attempt to
write a class that behaves like a pointer in D will be unwieldy, or at
least ugly. Admittedly, this isn't much of an issue when just using
foreach, but might be an issue with any kind of STL-like algorithms
library. (Which is what you're inviting when using C++'s iterator
semantics.)
>
>> Instead, we should look to other languages for inspiration. I suggest
>> Java and Python, whose iterator semantics are similar. Other posters
>> have already made suggestions on this front. I would only suggest the
>> "opIter" method for getting an iterator, and that the iterator class
>> itself be available via T.iterator (whether it is a nested class or an
>> alias shouldn't matter). Classes can provide methods for returning
>> alternate iterators, as well. (A class providing opIter should be
>> iterated over as easily as an iterator itself.)
>
>
--
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org
More information about the Digitalmars-d
mailing list