ideas about ranges
Steven Schveighoffer
schveiguy at yahoo.com
Thu May 21 13:06:00 PDT 2009
The thread discussing what to do for input ranges vs. forward ranges got
me thinking.
The range concept may be defined backwards in terms of which is more
specialized. Consider that an input range is always usable as a stream.
But a stream is not easy to use as an input range (the range primitive).
Case in point, a file. To fit into the most primitive range concept, it
must define 3 functions:
front()
popFront()
empty()
empty is easy, it's just "am I at end of file"
But front is not so easy. In order to know what's at the front, you need
to read it. And at that point you've altered the underlying file.
Look at the implementation of front and popFront for a possible FileByChar
implementation:
dchar front()
{
if(!bufferValid)
popFront();
return buffer;
}
void popFront()
{
// read buffer from source, setting bufferValid if the range isn't
empty by default
}
What sucks about this is, we have to introduce a buffer in the range, just
because we can't look at data until we've popped it. Not only that, but
calling front a file before anything is read requires a check to fill the
buffer in case we haven't read anything yet. This could be alleviated by
filling in the constructor, but it's still more complex than necessary.
Consider also that the underlying stream might already be buffered, so we
are buffering a buffer.
And finally, if you copy such a range, the buffer might be copied while
the stream itself may not. this could result in strange garbage data.
But since the primitives for input range are set by the compiler (it uses
them to do foreach), we have to implement them to make our stream ranges
friendly to foreach.
Round peg, meet square hole.
But what are the true requirements for iteration using foreach?
1. check if there's anything left
2. get the next element
Step 2 now is split into popFront and front. So a foreach loop is a
rewritten for loop like this:
foreach(x; range)
{
...
}
translates to:
{
auto _r = range;
while(!_r.empty)
{
auto x = _r.front();
_r.popFront();
...
}
}
What if step 2 was one function? Call it popNext(), and make it
equivalent to calling _r.front() and popFront() in one step on ranges that
implement that method.
How does this work with foreach?
{
auto _r = range;
while(!_r.empty)
{
auto x = _r.popNext();
...
}
}
Basically, the same code, one less line.
Consider that any range defined today with front() and popFront() can
implement popNext (and popNext could be an external function if we can get
3015 resolved).
So what I think we may need is a different range primitive:
An iterable range defines: (name to be decided)
bool empty()
T popNext()
An input range is an iterable range that also defines:
T front();
popFront();
Now look at our FileByChar example as an iterable range:
T popNext()
{
return source.get(); // no buffering required
}
And it works perfectly with the new foreach requirements.
And it correctly doesn't work with algorithms that require front and
popFront.
-Steve
More information about the Digitalmars-d
mailing list