Proposal: takeFront and takeBack

Wed Jul 4 08:17:53 PDT 2012

On 7/4/12 8:33 AM, deadalnix wrote:
> Le 04/07/2012 13:04, Roman D. Boiko a écrit :
>> On Wednesday, 4 July 2012 at 10:54:41 UTC, deadalnix wrote:
>>> I do agree with that. However, the current proposal have the same
>>> drawback but on the code that use the range.
>>>
>>> Both solutions are messy IMO.
>>
>> What is error-prone in current client code?
>>
>> If particular algorithm wants to take advantage of a potential speed-up,
>> it may decide to check whether hasConsume!Range, and call consumeFront
>> instead of front + popFront. Nobody is obliged to do so.
>>
>> The only problem I can see is potential code duplication, which is lower
>> priority in performance-critical part of code, IMO.
>>
>
> You have to maintain 2 version of you algorithm. This is more work to
> test, to maintain, and it is likely to introduce more bugs.
>
> More code == more bugs. More NPATH = more bugs and harder to test and to
> maintain.
>
> In addition, this will result in practice in less generic code, because
> one may not need the version of the algorithm without consume primitives
> in his/her own code and not code it at all.

I agree. Can't get behind this proposal.

First, it solves the problem of double decoding for UTF-8 and UTF-16 
strings. Before that, we need to estimate how bad the problem is and how 
we can optimize front/popFront() to alleviate it. For all I can tell, in 
ASCII-heavy strings we're looking at optimizing away ONE redundant 
comparison in a sequence that involves a bounds check, two word 
assignments, and another comparison. Off the top of my head I don't 
think the gains are going to be impressive.

Looking at the current implementation in std.array:

void popFront(A)(ref A a)
if (isNarrowString!A && isMutable!A && !isStaticArray!A)
{
     assert(a.length, "Attempting to popFront() past the end of an array 
of "
             ~ typeof(a[0]).stringof);
     a = a[std.utf.stride(a, 0) .. $];
}

So here we have a bounds check, a call to stride, and two word 
assignments. The implementation of stride itself is:

uint stride(S)(in S str, size_t index) @safe pure
     if (is(S : const(char[])))
{
     immutable c = str[index];
     if (c < 0x80)
         return 1;
     else
         return strideImpl(c, index);
  }

Here, even if we assume stride is inlined, we have one redundant bounds 
check that we can avoid, and redundant operations with "index" which is 
always 0.

In theory all of these redundant operations can be done away with 
optimization techniques, but probably some aren't. So we should look 
first at optimizing this flow heavily before looking at an API addition 
that has disadvantages in other dimensions.

Andrei