Ranges and/versus iterators
Steven Schveighoffer
schveiguy at yahoo.com
Tue Mar 23 19:12:37 PDT 2010
Andrei Alexandrescu Wrote:
> On 03/23/2010 03:46 PM, Steven Schveighoffer wrote:
> > A while back, you identified one of the best interfaces for input ranges:
> >
> > E* getNext();
> >
> > Which allows for null returns when no data is left. The drawback is that
> > E must be either referenced or allocated on the heap (providing storage
> > to the function is an option). But the killer issue was that safeD would
> > not allow it. However, in recent times, you have hinted that safeD may
> > allow pointers, but disallow bad pointer operations. In light of this,
> > can we reconsider this interface, or other alternatives using pointers?
> >
> > I've always felt that if we were to define ranges for streams in a
> > non-awkward way, we would need an "all in one" operation, since not only
> > does getting data from the range move the range, but checking for empty
> > might also move the range (empty on a stream means you tried to read and
> > got nothing).
>
> I'd gladly reconsider E* getNext(), and I like it a lot, but that
> doesn't accommodate ranges that want to return rvalues without storing
> them (e.g. a range using getchar() as a back-end, and generally streams
> that don't correspond to stuff stored in memory). If it's not in memory,
> there's no pointer to it.
First, a range backed by getchar is about as useful as functional qsort ;)
Second, you *have* to read data into memory. Even with the ranges as they currently are, you have to read into memory. At least this is less awkward.
Take for instance a line iterator. You have to read enough to see the line terminator, but you most likely do not read *exactly* to the line terminator, so you just read in chunks until you get a line, then return the pointer to the data. It works actually quite elegantly.
Third, the memory could be supplied by the caller. For instance, if you wrote the function like this:
E* getNext(E* buf = null);
Then foreach could do something like this:
foreach(e; streamrange)
=>
E _e;
while(auto e = streamrange.getNext(&_e))
To avoid heap allocation. Of course, heap allocation would be the default if buf is null.
Tango does this sort of trick quite often, and it makes the I/O code extremely fast.
Also, another thing to think about is we can generalize the return type to satisfying the condition:
iff range is empty then cast(bool)range.getNext == false.
This means as long as your range cannot return a null element for a non-empty return, it is OK not to use a pointer. For example, the line iterator again... it can be written like:
const(char)[] getNext()
because you will only ever return a null const(char)[] when there is no data left.
I don't think we should give up on trying to make a stream range that is not awkward, I really dislike the way today's input ranges map to streams.
-Steve
More information about the Digitalmars-d
mailing list