protocol for using InputRanges

w0rp devw0rp at gmail.com
Fri Mar 28 05:54:20 PDT 2014


I think something has gone wrong in this discussion somewhere. It 
is like this.

Do not assume .empty is const, this is unreasonable.

Do not assume .front is const, this is unreasonable.

.popFront has the only high-level observable difference in 
removing something from the front of an InputRange.

I say it is unreasonable to expect .empty or .front to be const 
because a range which is strictly an InputRange (not a 
ForwardRange...) is likely to be doing some kind of work where 
stepping through its sequence of data is destructive. As soon as 
the thing is read in the implementation of the InputRange, you 
can't read that thing again. You've cached it or it is gone. Thus 
.empty and .front must be non-const and lazily evaluate caching 
the next value.

I do not see it as a major issue to check .empty or call .front 
many times, even with this caching behaviour. Commonly this is as 
simple as checking whether or not you still have a value to 
provide inside the InputRange. We use assert(!empty) etc. because 
honestly I cannot imagine another way to make this safe.

If you really, really hate this double-checking in .empty or 
.front, you could try to change InputRanges so they behave like 
Java Iterators. I believe this is a really bad idea. You have to 
start thinking about returning nullable types. You have to push 
the job of caching values on to users of the InputRanges, as in 
each and every algorithm. It's just a really bad choice, and with 
the kind of boxing you'll need to do for nullable types, you'll 
surely end off worse as far as performance goes anyway.

Really, the range protocol is as Walter specified at the start of 
this thread. The documentation should reflect how .empty or 
.front may result in caching the next value. I really don't think 
this is a major performance problem worth worrying about, though. 
If it's really, really worth it for certain standard algorithms, 
perhaps something can be done in private range methods or similar 
to offer some kind of speed boost. ('package' privacy for std.* 
is feasible.) I don't think there is a lot to be gained from this 
in general, though.

Streams are different, and may be more appropriate for some 
functions for performance reasons. I still prefer ranges for code 
where performance is less critical, because if you aren't very 
worried about performance, the range API is a lot nicer. I'm not 
dead set on maximum performance if it makes life a lot harder in 
my code, personally. I view these kinds of improvements as 
premature optimisations if I'm writing something new.


More information about the Digitalmars-d mailing list