RFC on range design for D2
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Wed Sep 10 13:16:45 PDT 2008
Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
>> Notice that "a range can't grow" is different from "a range can't be
>> assigned from a larger range". In particular, a range operation can return
>> a range larger than both input ranges. But not larger than their union
>> :o).
>
> Yes, so every time you add an element you have to update your forward range
> from the 'all' range so it includes the new element at the end. Every time
> you remove an element, you have to update your reverse range from the 'all'
> range so it excludes the element at the beginning. Failure to do this
> results in invalid ranges, which seems to me like a lot more work than
> simply not doing anything (in the case of an iterator). The pitfalls of
> using ranges for dynamically changing containers might outweigh the
> advantages that they provide in certain cases.
No, this is incorrect. I don't "have to" at all. I could define the
behavior of range as you mention, or I could render them undefined.
Iterators invalidate anyway at the drop of a hat, so they're none the
wiser. You can't transform a lack of an advantage into a disadvantage.
"Look at this pineapple. It's fresher than the other, and bigger too."
"No, it's about as big. That pineapple sucks."
>>> My belief is that ranges should be the form of input to algorithms, but
>>> iterators should be provided for using containers as general data
>>> structures. Similar to how strings are represented by arrays/slices, but
>>> iterators (pointers) exist if you need them.
>> If we agree it's better without iterators if not needed, we'd need a
>> strong case to add them. Right now I have a strong case against them.
>
> I don't need to worry about whether you have them or not, I can always
> implement them on my own ;) Really, range/iterator support doesn't require
> direct support from the compiler (except for builtin arrays), and any
> improvements made to the compiler to support ranges (such as reference
> returns, etc) can be applied to iterators as well.
>
> I think ranges are an excellent representation when a range of elements is
> needed. I think a cursor or iterator is an excellent representation when an
> individual position is needed.
>
>>> I'll probably move forward with this model in dcollections, I really like
>>> the range idea, and in general the view on how ranges are akin to slices.
>>> But I also like having access to iterators for other functions.
>> Which functions?
>
> Functions which take or return a single position. Such as 'erase the
> element at this position' or 'find the position of element x'.
I agree. In fact I agreed in my original document, which I quote:
``Coding with ranges also has disadvatages. Some algorithms work
naturally with individual iterators in the "middle" of a range. A
range-based implementation would have to maintain a redundant range
spanning e.g. from the beginning of the container to that middle.''
However, I could meanigfully rewrite std.algorithm to work on ranges
alone. The disadvantage does exist but is minor, For example, find does
not return an iterator. It simply shrinks its input range until the
element is found, or until it is empty. That way you can nicely use the
result of find iteratively.
Range find(alias pred = "a == b", Range, E)(Range haystack, E needle)
{
alias binaryFun!(pred) test;
for (; !haystack.isEmpty; haystack.next)
{
if (test(haystack.first, needle)) break;
}
return haystack;
}
This is quite a few bites smaller than the previous version, which is
now to be found in std.algorithm:
Iterator!(Range) find(alias pred = "a == b", Range, E)(Range haystack, E
needle)
{
alias binaryFun!(pred) test;
for (auto i = begin(haystack); i != end(haystack); ++i)
{
if (test(*i, needle)) return i;
}
return end(haystack);
}
It has two less variables, and VERY importantly, one less type to deal
with. Arguments aired against primitive ranges systematically omit this
important simplification they bring. When you don't weigh in the
advantages, of course all there is to be seen are the disadvantages.
Better yet, when find does return, client code's in better shape because
it doesn't need to compare the result against end(myrange). It can just
test whether it's empty and be done with.
So a newcomer to D2 would have to have an understanding of containers
and ranges. Containers own data. They offer various traversals to crawl
them in the form of ranges. Ranges are generalized slices.
If iterators are thrown into the mix, things get more complex because
iterators are a lower-level primitive, a generalized pointer. So the
newcomer would have to ALSO understand iterators and deal with functions
that require or return either. They'd have to learn how to pair
iterators from ranges and how to extract iterators from ranges (more
primitives). They'd also have to understand when it's better to hold on
to a range (most of the time) versus a naked iterator (seldom and for a
dubious convenience).
I /understand/ there are advantages to iterators. Just don't forget the
cost when discussing them.
I am also sure that if I sat down long enough contemplating my navel I
could come with more examples of iterators=good/ranges=bad. I am also
sure that if I continued doing so I could also figure cases where a
doubly linked-list iterator that "knows" whether it's atBegin or atEnd
could come in handily. In fact how about this imaginary discussion
between Stepanov and his imaginary colleague Tikhonov:
Stepanov: "Here, I have these cool iterators. I can express a great deal
of stuff with them. It's amazing."
Tikhonov: "Ok, I have a doubly-linked list. An iterator is a node in the
list, right?"
S: "Da. Those are bidirectional iterators because they can move in two
directions in the list."
T: "Ok, my first element has prev == NULL and last element has next ==
NULL. Does your iterator know when it's at the begin and at the end of
the list?"
S: "No. You see, you'd have to compare two iterators to figure that out.
Just pass around an iterator to the beginning and end of the list
fragment you're interested in, as you find fit."
T: "That sucks! I want an iterator to move up and down and tell me when
it's at the beginning and at the end, without another stinkin' iterator."
S: "I implemented a great deal of algorithms without needing that. What
algorithms of yours can't work with a comparison instead of atBegin and
atEnd?"
T: "Well, I need to think of it. Maybe some video buffer or something."
S: "That works. You just save the iterator at the beginning of the
sliding window. Then you compare against it."
T: "But I don't like that. I want you to define atBegin and atEnd so I
don't need to carry an extra laggard!"
S: "Then what algorithms would fundamentally rest on that particular
feature?"
T: "No idea. Here, let me look at my navel."
S: "While you do that, let me ask you this. Whatever those algorithms
are, they can't work on a circular list, right?"
T: "I guess not. atBegin is never true for a circular list, unless,
damn, you keep another iterator around to compare with."
S: "So those algorithms of yours would work on a doubly-linked list but
not on a circular list. Are you sure you care for that distinction and
that loss in generality?"
T: "Gee, look at the time. It's Friday evening! Let's go have a beer."
Andrei
More information about the Digitalmars-d-announce
mailing list