Proposal: takeFront and takeBack

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Wed Jul 4 21:04:03 PDT 2012


On 7/4/12 10:11 PM, Jonathan M Davis wrote:
> On Wednesday, July 04, 2012 21:33:31 Andrei Alexandrescu wrote:
>> Great. Could you please post some code so we play with it? Thanks.
>
> Okay. You can use this:
[snip]

Thanks. I made the following change to popFront:

@trusted void popFront(A)(ref A a)
if (isNarrowString!A && isMutable!A && !isStaticArray!A)
{
     assert(a.length, "Attempting to popFront() past the end of an array of "
             ~ typeof(a[0]).stringof);
     immutable c = a[0];
     if (c < 0x80)
     {
         a = a.ptr[1 .. a.length];
     }
     else
     {
         import core.bitop;
         immutable msbs = 7 - bsr(~c);
         if ((msbs >= 2) & (msbs <= 6))
         {
             a = a[msbs .. $];
         }
         else
         {
             //throw new UTFException("Invalid UTF-8 sequence", 0);
         }
     }
}

For some reason, uncommenting the throwing code makes the function 
significantly slower. That looks like a compiler issue, because moving 
the throw into a separate function seems to restore the speed.
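
A minimal sketch of that workaround, under some assumptions that are not 
in the post: the names popFrontChecked and throwInvalidUTF8 are 
hypothetical, the constraint is narrowed to char arrays, and the lead 
byte is widened and masked before bsr so the result does not depend on 
how the compiler promotes ~c. Whether this recovers the speed will depend 
on the compiler.

```d
import core.bitop : bsr;
import std.traits : isMutable, isStaticArray;
import std.utf : UTFException;

// Hypothetical helper: keeps the throw (and the exception allocation)
// out of the body of the hot function.
void throwInvalidUTF8()
{
    throw new UTFException("Invalid UTF-8 sequence", 0);
}

// Hypothetical name, chosen so it does not clash with std.range's popFront.
@trusted void popFrontChecked(A)(ref A a)
if (is(A : const(char)[]) && isMutable!A && !isStaticArray!A)
{
    assert(a.length, "Attempting to popFrontChecked() past the end of an array");
    immutable c = a[0];
    if (c < 0x80)
    {
        // ASCII fast path: drop one code unit without a bounds check.
        a = a.ptr[1 .. a.length];
        return;
    }
    // Count the leading 1 bits of the lead byte. 0xFF is handled
    // separately because bsr(0) has an undefined result.
    immutable msbs = c == 0xFF ? 8 : 7 - bsr((~cast(uint) c) & 0xFF);
    if ((msbs >= 2) & (msbs <= 6))
        a = a[msbs .. $];
    else
        throwInvalidUTF8();
}

void main()
{
    string s = "h\u00E9";   // 'h' followed by a two-byte sequence
    popFrontChecked(s);     // drops 'h'
    assert(s.length == 2);
    popFrontChecked(s);     // drops both bytes of U+00E9
    assert(s.length == 0);
}
```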

With the above, I get on a Mac:

ascii 126.61%: old [682 ms, 479 μs, and 3 hnsecs], new [864 ms, 102 μs, and 1 hnsec]
uni   86.76%:  old [1 sec, 888 ms, 17 μs, and 8 hnsecs], new [1 sec, 638 ms, 76 μs, and 3 hnsecs]

So the ascii string handling actually became 27% faster, whereas the uni 
string handling is 13% slower.

It might be argued that checking for validity is not the metier of 
popFront; exceptions should surface only when you actually try to use 
the data (e.g. by calling front). If popFront sees incorrect characters, 
it should just skip them one at a time. Following that argument, the 
implementation may be:

@trusted void popFront(A)(ref A a)
if (isNarrowString!A && isMutable!A && !isStaticArray!A)
{
     assert(a.length, "Attempting to popFront() past the end of an array of "
             ~ typeof(a[0]).stringof);
     immutable c = a[0];
     if (c < 0x80)
     {
         a = a.ptr[1 .. a.length];
     }
     else
     {
         import core.bitop;
         auto msbs = 7 - bsr(~c);
         if ((msbs < 2) | (msbs > 6))
         {
             msbs = 1;
         }
         a = a[msbs .. $];
     }
}
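
To make the "skip them one at a time" behavior concrete, here is a small 
self-contained sketch that walks a string containing a stray continuation 
byte. It is an illustration rather than the code that was benchmarked: 
the name popFrontLenient is hypothetical, the constraint is narrowed to 
char arrays, the lead byte is masked before bsr, and the final clamp for 
truncated input is an added assumption.

```d
import core.bitop : bsr;
import std.traits : isMutable, isStaticArray;

// Hypothetical name, chosen so it does not clash with std.range's popFront.
@trusted void popFrontLenient(A)(ref A a)
if (is(A : const(char)[]) && isMutable!A && !isStaticArray!A)
{
    assert(a.length, "Attempting to popFrontLenient() past the end of an array");
    immutable c = a[0];
    size_t n = 1;   // ASCII and invalid lead bytes both advance by one
    if (c >= 0x80 && c != 0xFF)   // bsr(0) is undefined, so 0xFF keeps n == 1
    {
        // Number of leading 1 bits in the lead byte.
        immutable msbs = 7 - bsr((~cast(uint) c) & 0xFF);
        if ((msbs >= 2) & (msbs <= 6))
            n = msbs;
    }
    // Assumption beyond the post: never slice past the end of truncated input.
    if (n > a.length)
        n = a.length;
    a = a.ptr[n .. a.length];
}

void main()
{
    // 'a', a stray continuation byte (0x80), then U+00E9 as 0xC3 0xA9.
    string s = "a\x80\u00E9";
    popFrontLenient(s);
    assert(s.length == 3);   // skipped 'a'
    popFrontLenient(s);
    assert(s.length == 2);   // skipped the stray byte on its own
    popFrontLenient(s);
    assert(s.length == 0);   // skipped the two-byte sequence
}
```

No exception is raised for the stray byte here; under the argument above, 
reporting it would be left to front.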

With this code I get:

ascii 115.39%: old [744 ms, 103 μs, and 6 hnsecs], new [858 ms, 628 μs, and 4 hnsecs]
uni   96.78%:  old [1 sec, 877 ms, and 461 μs], new [1 sec, 817 ms, 14 μs, and 3 hnsecs]


Andrei
