RFC on range design for D2

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Sep 9 10:38:41 PDT 2008


superdan wrote:
> Andrei Alexandrescu Wrote:
> 
>> What we want is a design that tells the truth. And a design that tells
>> the truth is this:
>>
>> r.isEmpty does whatever the hell it takes to make sure whether there's
>> data available or not. It is not const and it could throw an exception.
>>
>> v = r.getNext returns BY VALUE by means of DESTRUCTIVE COPY data that
>> came through the wire, data that the client now owns as soon as getNext
>> returned. There is no extra copy, no extra allocation, and the real
>> thing has happened: data has been read from the outside and user code
>> was made the only owner of it.
> 
> this is really kool n the gang. there's a sore point tho. if i wanna read strings from a file no prob.
> 
> for (auto r = stringSucker(stdin); !r.isEmpty(); )
> {
>     string s = r.getNext();
>     // play with s
> }
> 
> but a new string is allocated every line. that's safe but slow. so i want some more efficient stuff. i should use char[] because string don't deallocate.
> 
> for (auto r = charArraySucker(stdin); !r.isEmpty(); )
> {
>     char[] s = r.getNext();
>     // play with s
> }
> 
> no improvement. same thing a new char[] is alloc each time. maybe i could do
> 
> for (auto r = charArraySucker(stdin); !r.isEmpty(); )
> {
>     char[] s = r.getNext();
>     // play with s
>     delete s;
> }
> 
> would this make stuff faster. maybe. maybe not. and it's not general. what i'd like is some way of telling the range, i'm done with this you can recycle and reuse it. it's a green green world.
> 
> for (auto r = charArraySucker(stdin); !r.isEmpty(); )
> {
>     char[] s = r.getNext();
>     // play with s
>     r.recycle(s);
> }
> 
> sig is recycle(ref ElementType!(R)).

This is a great point. Unfortunately that won't quite work properly. 
Equally unfortunately, you just revealed an issue with my design.

One goal of range design is that algorithms written for "inferior" 
ranges should work seamlessly with "superior" ranges. For example, find 
should work fine with any range, so we should write (with your idea):

R find(R, V)(R r, V v)
{
     ElementType!(R) tmp;
     for (; !r.isEmpty; r.recycle(tmp))
     {
         tmp = r.getNext;
         if (tmp == v) break;
     }
     return r;
}

This looks great but imagine we apply this to a collection of some 
costly ElementType!(R). Then getNext rightly returns a reference because 
there's no need to create a copy unless absolutely necessary. But then 
we realize that our look forces a copy no matter what! The copy was good 
for the input range. But it's bad for the actual container that wouldn't 
need to copy anything.

It looks like I need to reconsider the design mentioned by Lionello, in 
which both input iterators and forward iterators expose separate 
isEmpty/next/first operations. Then an input range r caches the last 
read value internally and gives access to it through r.first.

I have an idea on how to disallow at least egregious errors. Consider:

void fill(R, V)(R r, V v)
{
     for (; !r.isEmpty; r.next) r.first = v;
}

If "r.first = v;" compiles, then the following scandalous misuse goes 
unpunished:

fill(charArraySucker, "bogus");

A way to prevent r.first = v from compiling is to define a one-argument 
function first:

struct MyRange(T)
{
     private void first(W)(W whatever); // not implemented
     ref T first();
}

Expression r.first will resolve to the second function. Expression 
r.first = v will translate into r.first(v) so it will resolve to the 
first function, which will fail the protection test.

Then there remains the problem:

void bump(R, V)(R r)
{
     for (; !r.isEmpty; r.next) ++r.first;
}

bump(charArraySucker); // bogus

Sigh...


Andrei


More information about the Digitalmars-d-announce mailing list