ideas about ranges

Steven Schveighoffer schveiguy at yahoo.com
Thu May 21 13:06:00 PDT 2009


The thread discussing what to do for input ranges vs. forward ranges got  
me thinking.

The range concept may be defined backwards in terms of which is more  
specialized.  Consider that an input range is always usable as a stream.   
But a stream is not easy to use as an input range (the range primitive).

Case in point, a file.  To fit into the most primitive range concept, it  
must define 3 functions:

front()
popFront()
empty()

empty is easy, it's just "am I at end of file"

But front is not so easy.  In order to know what's at the front, you need  
to read it.  And at that point you've altered the underlying file.

Look at the implementation of front and popFront for a possible FileByChar  
implementation:

dchar front()
{
   if(!bufferValid)
     popFront();
   return buffer;
}

void popFront()
{
    // read buffer from source, setting bufferValid if the range isn't  
empty by default
}

What sucks about this is, we have to introduce a buffer in the range, just  
because we can't look at data until we've popped it.  Not only that, but  
calling front a file before anything is read requires a check to fill the  
buffer in case we haven't read anything yet.  This could be alleviated by  
filling in the constructor, but it's still more complex than necessary.   
Consider also that the underlying stream might already be buffered, so we  
are buffering a buffer.

And finally, if you copy such a range, the buffer might be copied while  
the stream itself may not.  this could result in strange garbage data.

But since the primitives for input range are set by the compiler (it uses  
them to do foreach), we have to implement them to make our stream ranges  
friendly to foreach.

Round peg, meet square hole.

But what are the true requirements for iteration using foreach?

1. check if there's anything left
2. get the next element

Step 2 now is split into popFront and front.  So a foreach loop is a  
rewritten for loop like this:

foreach(x; range)
{
   ...
}

translates to:
{
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.front();
     _r.popFront();
     ...
   }
}

What if step 2 was one function?  Call it popNext(), and make it  
equivalent to calling _r.front() and popFront() in one step on ranges that  
implement that method.

How does this work with foreach?

{
   auto _r = range;
   while(!_r.empty)
   {
     auto x = _r.popNext();
     ...
   }
}

Basically, the same code, one less line.

Consider that any range defined today with front() and popFront() can  
implement popNext (and popNext could be an external function if we can get  
3015 resolved).

So what I think we may need is a different range primitive:

An iterable range defines: (name to be decided)

bool empty()
T popNext()

An input range is an iterable range that also defines:

T front();
popFront();

Now look at our FileByChar example as an iterable range:

T popNext()
{
    return source.get(); // no buffering required
}

And it works perfectly with the new foreach requirements.

And it correctly doesn't work with algorithms that require front and  
popFront.

-Steve



More information about the Digitalmars-d mailing list