protocol for using InputRanges

Sat Mar 29 14:02:30 PDT 2014

On Saturday, 29 March 2014 at 01:36:46 UTC, Steven Schveighoffer 
wrote:
> On Fri, 28 Mar 2014 07:47:22 -0400, Marc Schütz 
> <schuetzm at gmx.net> wrote:
>> On Thursday, 27 March 2014 at 16:12:36 UTC, monarch_dodra 
>> wrote:
>>> If you initialized the next element in both constructor and 
>>> popFront, then you'd get rid of both these checks.
>>>
>>
>> ... but of course lose laziness.
>
> In this case, laziness is not critical. Decoding the element is 
> an O(1) operation, and when looping through, you will decode it 
> anyway.
>
> When processing a 20 element string, the cost of checking to 
> see if you have decoded on every empty or front call may 
> override the front-loaded cost of decoding the first element on 
> construction. It's sure to add to the cost if you are 
> processing all 20 elements, since you decode them all anyway.
>
> On other ranges, it's more important when the first element 
> costs a lot to fetch. HOWEVER, it's not critically important to 
> delay that unless you are not going to process that element. 
> For example, if you are foreach'ing over all the elements, the 
> delay doesn't matter.
>
> I'd rather the higher level code decide whether to delay or 
> not, depending on the situation. Requiring a protocol change 
> for such detailed knowledge seems unbalanced.

I was thinking more of the fact that you need to read something 
on construction, rather than on consumption, and this read 
might be noticeable. There was the example of 
`stdin.byLine().filter(...)` (or something similar, I don't 
remember exactly), which reads from stdin on construction. This 
changes the behaviour of the program, because the read operation 
will (probably) block.
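
To make the effect concrete, here is a minimal sketch. It does not use 
the real Phobos `filter`; `CountingSource` and `EagerFilter` are 
hypothetical types made up for illustration. The filter skips to the 
first match in its constructor, so the source is read before the 
caller has asked for anything.

```d
// Hypothetical source that counts how often the underlying data is read.
struct CountingSource
{
    int[] data;
    int* reads; // incremented on every call to front

    @property bool empty() const { return data.length == 0; }
    @property int front() { ++*reads; return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Hypothetical eager filter (an assumption for illustration, not Phobos):
// it advances to the first matching element in the constructor.
struct EagerFilter(alias pred, R)
{
    R source;

    this(R source)
    {
        this.source = source;
        skip(); // reads from the source during construction
    }

    private void skip()
    {
        while (!source.empty && !pred(source.front))
            source.popFront();
    }

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }
    void popFront() { source.popFront(); skip(); }
}

void main()
{
    int reads;
    auto evens = EagerFilter!(x => x % 2 == 0, CountingSource)(
        CountingSource([1, 2, 3, 4], &reads));
    // Nothing has been requested from evens yet, but the source
    // was already read twice: 1 was rejected, 2 was accepted.
    assert(reads == 2);
}
```

With a source backed by stdin, those reads during construction are 
where the program would block.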

I'd suggest making it a requirement for ranges and algorithms 
_not_ to start consuming the underlying data until one of 
empty/front/popFront is called, even if that has a negative 
effect on performance. That's why I was asking for performance 
numbers, to see whether there is any measurable effect at all. 
If there isn't, that's just another argument for adding that 
requirement.
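
A rough sketch of what such a requirement would mean for a range 
author, again with hypothetical types (`CountingSource`, `LazyFilter`) 
rather than anything from Phobos: the range keeps a flag and does its 
initial skipping on the first call to empty/front/popFront instead of 
in the constructor. The flag check on every call is the cost whose 
size is in question.

```d
// Hypothetical source that counts how often the underlying data is read.
struct CountingSource
{
    int[] data;
    int* reads; // incremented on every call to front

    @property bool empty() const { return data.length == 0; }
    @property int front() { ++*reads; return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Hypothetical lazy filter (an assumption for illustration): the source
// is not touched until the first call to empty, front or popFront.
struct LazyFilter(alias pred, R)
{
    R source;
    private bool primed; // checked on every primitive call

    this(R source) { this.source = source; }

    private void prime()
    {
        if (primed) return;
        skip();
        primed = true;
    }

    private void skip()
    {
        while (!source.empty && !pred(source.front))
            source.popFront();
    }

    @property bool empty() { prime(); return source.empty; }
    @property auto front() { prime(); return source.front; }
    void popFront() { prime(); source.popFront(); skip(); }
}

void main()
{
    int reads;
    auto evens = LazyFilter!(x => x % 2 == 0, CountingSource)(
        CountingSource([1, 2, 3, 4], &reads));
    assert(reads == 0);   // construction did not touch the source
    assert(!evens.empty); // the first primitive call primes the range
    assert(reads == 2);   // 1 was rejected, 2 was accepted
    assert(evens.front == 2);
}
```

Users of the range call the same three primitives as before; only the 
implementation of the range changes.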

This is then, IMO, a very acceptable additional burden to place 
on the writers of ranges. I agree, however, that it's not a good 
idea to change the range protocol, i.e. what _users_ of ranges 
have to abide by. That would be a breaking change, and an 
especially bad one, because I see no way to detect that a user 
skipped the call to `empty` during iteration because they already 
knew that more elements were available.