protocol for using InputRanges

Tue Mar 25 13:15:32 PDT 2014

It's pretty clear that:

1. the protocol is COMPLETELY undocumented and undefined.

2. in this vacuum, people (including me) have made all sorts of assumptions, and 
assumed those assumptions were universal and shared.

Since ranges+algorithms are central to D, this is a very serious problem. There 
are two competing schools of thought:

1. maximum flexibility

2. maximum performance

(1) has been explained well in this thread already. (2) not so much, so I'll 
stand up for that one.

I've been recently involved in writing high performance apps, as anyone 
following ScopeBuffer knows. With such apps, it's clear that:

1. size matters - the fewer registers an object fits in, the faster it goes
2. every instruction matters - more operations == slower code
3. lazy seems to be faster, mainly because it eliminates the storage management 
cost of buffers and extra copying into and out of those buffers

We want to appeal to the high performance coders. To maximize performance, 
ranges should be optimized to the inner loop case, which is:

     while (!r.empty) { auto e = r.front; ... e ...; r.popFront(); }

With the additional proviso that ranges are passed to algorithms by value, so 
they should be cheap to copy. Cheap to copy usually means them being small.

A) I know that my range is not empty, so I can skip calling empty.

Since front is guaranteed to succeed if !empty, this puts a requirement on many 
ranges that they have a non-trivial constructor that 'primes the pump'. Of 
course, priming may fail, and so construction may throw, which is not good 
design. Next, priming requires a buffer in the range. Buffers add size, making 
them slower to copy, and will often require another field saying if the buffer 
has data in it or not, further bumping the size. And lastly, it means that lazy 
ranges will be required to read the first element, even if the range isn't then 
subsequently used, which defeats what one would expect from a lazy range. By 
having mere construction of a range initiate reading from source, this makes the 
useful idiom of separating construction from use impractical. (For example, 
building a pipeline object, then sending it to another thread for execution.)

All this saves for the user is one call to empty for the entire algorithm, at a 
cost incurred with every iteration. I.e. it selects O(n) to save O(1).

B) I want to be able to call front multiple times in a row, and expect to get 
the same value.

This can force the range to incorporate a one element buffer and a flag 
indicating the status of that buffer. The range instance gets bigger and more 
expensive to copy, and the cost of manipulating the flag and the buffer is added 
to every loop iteration. Note that the user of a range can trivially use:
      auto e = r.front;
      ... using e multiple times ...
instead.

The big problem with (A) and (B) is these costs become present in every range, 
despite being rarely needed and having simple alternatives. This is the wrong 
place to put cost and complexity. The user should be making these decisions 
based on his needs.

Hence, I propose that the protocol for using input ranges be defined as:

     while (!r.empty) { auto e = r.front; ... e ...; r.popFront(); }

This makes it possible to build pipelines that are firehoses with no kinks or 
constrictions in them. It optimizes for the inner loop case, not boundary cases.