RFC on range design for D2

Tue Sep 9 06:28:48 PDT 2008

On Tue, 09 Sep 2008 04:37:41 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> Denis Koroskin wrote:
>> 1) There is a typo:
>>  // Copies a range to another
>> void copy(R1, R2)(R1 src, R2 tgt)
>> {
>>     while (!src.isEmpty)
>>     {
>>         tgt.putNext(r.getNext);  // should be tgt.putNext(src.getNext);
>>     }
>> }
>
> Thanks! Fixed.
>
>> 2) R.next and R.pop could have better names. I mean, they represent  
>> similar operations yet names are so different.
>
> I agree. Next was a natural choice. I stole pop from Perl. Any symmetric  
> and short operation names would be welcome.
>

1) R.left += n / R.right -= n
2) R.left.advance(n) / R.right.advance(n) (or move)
3) R.advanceLeft(n)/R.advanceRight(n) (or moveLeft/moveRight)

>> 3) Walter mentioned that built-in array could be re-implemented using a  
>> pair of pointers instead of ptr+length. Will it ever get a green light?  
>> It fits range concept much better.
>
> Walter told me to first implement my design, and if it works, he'll do  
> the change. Yes, it does fit ranges much better because the often-used  
> next and, um, pop will only touch one word instead of two.
>
>> 4) We need some way of supporting dollar notation in user containers.  
>> The hack of using __dollar is bad (although it works).
>
> It doesn't work for multiple dimensions. There should be an  
> opDollar(uint dim) that gives the library information on which argument  
> count it occured in. Consider:
>
> auto x = matrix[$-1, $-1];
>
> Here the dollar's occurrences have different meanings. A good start  
> would be to expand the above into:
>
> auto x = matrix[matrix.opDollar(0)-1, matrix.opDollar(1)-1];
>
>> 5) I don't quite like names left and right! :) I think they should  
>> represent limits (pointers to begin and end, in case of array) rather  
>> that values. In this case, built-in arrays could be implemented as  
>> follows:
>>  struct Array(T)
>> {
>>     T* left;
>>     T* right;
>>     size_t length() { return right-left; }
>>     ref T opIndex(size_t index) { return left[index]; }
>>     // etc
>> }
>>  The rationale behind having access to range limits is to allow  
>> operations on them. For example,
>> R.left-=n;
>
> I disagree. Defining operations on range limits opens a box that would  
> make Pandora jealous:
>
> 1. What is the type of left in general? Um, let's define Iterator!(R)  
> for each range R.
>
Yes, it should be iterator.

> 2. What are the primitives of an iterator? Well, -= sounds good. How do  
> you check it for correctness? In fact, how do you check any operation of  
> a naked iterator for correctness?
>

First of all, left is a forward iterator and right is a backward iterator.
That's why left might support ++, += and = while right support --, -= and  
=.
Note that while we can rename ++ to advance() to get uniform naming.

In many cases it is desirable to store an iterator and set an iterator to  
the range bounds.

> 3. I want to play with some data. What should I use here, ranges or  
> iterators? ...
>

I don't think that ranges replace iterators (yet?). I define range as a  
pair of iterators to myself.

For example, what is the equivalent D code of the following:

for (iterator it = ..., end = list.end(); it != end; ) {
     if (predicate(*it)) {
         it = listerase(it);
     } else {
         ++it;
     }
}

Range range = vector.all;
while (!range.isEmpty) {
     if (predicate(range.left)) {
         ???
     } else {
         range.next;
     }
}

Iterator solution would be as follows:
range.left = list.erase(range.left);

I took a list for the sake of simlicity, because erasing an element  
doesn't update an end. However, in general, both left and right should be  
updated. A good solution would be to return a new range, like this:

Range range = ...;
range = container.erase(???); // but what should be here?

maybe this:
container.erase(range); // erase the whole range, no need to return  
anything because it would be empty anyway
container = container.eraseFirtsN(range, n); // erase first n elements  
 from a range, and return the rest.
container = container.eraseLastN(range, n);  // the same but for other end

I don't say that iterators magically solve everyrthing, I merely try to  
find problematic places.

> Much of the smarts of the range design is that it gets away WITHOUT  
> having to answer embarrassing questions such as the above. Ranges are  
> rock-solid, and part of them being rock-solid is that they expose enough  
> primitives to be complete, but at the same time do not expose dangerous  
> internals.
>
>> could be used instead of
>> foreach(i; 0..n) {
>>     R.pop();
>> }
>>  which is more efficient in many cases.
>
> Stop right there. That's not a primitive. It is an algorithm that gets  
> implemented in terms of a primitive. I disagree that such an algorithm  
> is an operator and does not have a name such as popN.
>
Okay, but you didn't mention popN() in the paper :)

>> Other that that - great, I like it.
>
> Thanks for your comments.
>
>
> Andrei

Thank YOU!