protocol for using InputRanges
w0rp
devw0rp at gmail.com
Fri Mar 28 05:54:20 PDT 2014
I think something has gone wrong in this discussion somewhere. It
is like this.
Do not assume .empty is const, this is unreasonable.
Do not assume .front is const, this is unreasonable.
.popFront has the only high-level observable difference in
removing something from the front of an InputRange.
I say it is unreasonable to expect .empty or .front to be const
because a range which is strictly an InputRange (not a
ForwardRange...) is likely to be doing some kind of work where
stepping through its sequence of data is destructive. As soon as
the thing is read in the implementation of the InputRange, you
can't read that thing again. You've cached it or it is gone. Thus
.empty and .front must be non-const and lazily evaluate caching
the next value.
I do not see it as a major issue to check .empty or call .front
many times, even with this caching behaviour. Commonly this is as
simple as checking whether or not you still have a value to
provide inside the InputRange. We use assert(!empty) etc. because
honestly I cannot imagine another way to make this safe.
If you really, really hate this double-checking in .empty or
.front, you could try to change InputRanges so they behave like
Java Iterators. I believe this is a really bad idea. You have to
start thinking about returning nullable types. You have to push
the job of caching values on to users of the InputRanges, as in
each and every algorithm. It's just a really bad choice, and with
the kind of boxing you'll need to do for nullable types, you'll
surely end off worse as far as performance goes anyway.
Really, the range protocol is as Walter specified at the start of
this thread. The documentation should reflect how .empty or
.front may result in caching the next value. I really don't think
this is a major performance problem worth worrying about, though.
If it's really, really worth it for certain standard algorithms,
perhaps something can be done in private range methods or similar
to offer some kind of speed boost. ('package' privacy for std.*
is feasible.) I don't think there is a lot to be gained from this
in general, though.
Streams are different, and may be more appropriate for some
functions for performance reasons. I still prefer ranges for code
where performance is less critical, because if you aren't very
worried about performance, the range API is a lot nicer. I'm not
dead set on maximum performance if it makes life a lot harder in
my code, personally. I view these kinds of improvements as
premature optimisations if I'm writing something new.
More information about the Digitalmars-d
mailing list