Ranges, Performance, and opApply
Kapps at NotValidEmail.com
Sun Oct 30 01:02:35 PDT 2011
As it stands right now, ranges are quite inefficient. Phobos is designed
with an emphasis on ranges, including in performance-sensitive code such
as std.algorithm. No matter how optimized the range is, it will always
be slower than the equivalent opApply function (and, as far as I can
tell, offers no benefit for just iteration).
The fastest way of iterating over anything currently, is foreach over an
array. It beats even for(size_t i 0 to length-1) by a decent bit. The
lowest level range for all ranges is almost always an array (that is, if
you have a range using a range ... using a range, it is likely that the
last range is an array or uses an array). By implementing an opApply for
these ranges, that calls opApply on the wrapped range or wrapped array
(or other), you can basically double the performance cost of iterating
(for many iterations over a small body). It is not possible to make the
compiler-generated opApply make this optimization, as it does not know
what the underlying data for the range is.
Creating an opApply can be done on a per-range basis for ranges that are
used in performance-sensitive code (such as countUntil). If the manual
opApply foreach's over the input range, and the input range has a
similar manual opApply, there are fair performance benefits. If the
underlying range does not have a manual opApply, there will still be
some performance benefits, and does not break code in any way. If all
ranges being used have opApply and the last range is an array, then
performance benefits are very significant.
As an example, I made a test of 5 different ways of implementing
std.algorithm.filter, and two tests of another filter on top of the
output from the first one, one with a manual opApply, and one without.
Basically, a manual opApply was over twice as fast. The code and results
can be found at http://pastie.org/2781723. Ultimately, nothing is as
fast as just doing a foreach yourself, but that's to be expected. It can
be however, be easily made that the difference is not as extreme as it
So, should Phobos ranges attempt to have a manual opApply where feasibly
being used in performance sensitive code?
More information about the Digitalmars-d