Ranges and/versus iterators

Steven Schveighoffer schveiguy at yahoo.com
Tue Mar 23 19:12:37 PDT 2010


Andrei Alexandrescu Wrote:

> On 03/23/2010 03:46 PM, Steven Schveighoffer wrote:
> > A while back, you identified one of the best interfaces for input ranges:
> >
> > E* getNext();
> >
> > Which allows for null returns when no data is left. The drawback is that
> > E must be either referenced or allocated on the heap (providing storage
> > to the function is an option). But the killer issue was that safeD would
> > not allow it. However, in recent times, you have hinted that safeD may
> > allow pointers, but disallow bad pointer operations. In light of this,
> > can we reconsider this interface, or other alternatives using pointers?
> >
> > I've always felt that if we were to define ranges for streams in a
> > non-awkward way, we would need an "all in one" operation, since not only
> > does getting data from the range move the range, but checking for empty
> > might also move the range (empty on a stream means you tried to read and
> > got nothing).
> 
> I'd gladly reconsider E* getNext(), and I like it a lot, but that 
> doesn't accommodate ranges that want to return rvalues without storing 
> them (e.g. a range using getchar() as a back-end, and generally streams 
> that don't correspond to stuff stored in memory). If it's not in memory, 
> there's no pointer to it.

First, a range backed by getchar is about as useful as functional qsort ;)

Second, you *have* to read data into memory.  Even with the ranges as they currently are, you have to read into memory.  At least this is less awkward.

Take for instance a line iterator.  You have to read enough to see the line terminator, but you most likely do not read *exactly* to the line terminator, so you just read in chunks until you get a line, then return the pointer to the data.  It works actually quite elegantly.

Third, the memory could be supplied by the caller.  For instance, if you wrote the function like this:

E* getNext(E* buf = null);

Then foreach could do something like this:

foreach(e; streamrange)

=>

E _e;
while(auto e = streamrange.getNext(&_e))

To avoid heap allocation.  Of course, heap allocation would be the default if buf is null.

Tango does this sort of trick quite often, and it makes the I/O code extremely fast.

Also, another thing to think about is we can generalize the return type to satisfying the condition:

iff range is empty then cast(bool)range.getNext == false.

This means as long as your range cannot return a null element for a non-empty return, it is OK not to use a pointer.  For example, the line iterator again... it can be written like:

const(char)[] getNext()

because you will only ever return a null const(char)[] when there is no data left.

I don't think we should give up on trying to make a stream range that is not awkward, I really dislike the way today's input ranges map to streams.

-Steve



More information about the Digitalmars-d mailing list