Can you shrink it further?

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Mon Oct 10 19:48:22 PDT 2016


That looks good. I'm just worried about the jump forward - ideally the 
single-byte (ASCII) case would simply entail a quick return. I tried a 
fix, but it didn't do what I wanted in ldc. We shouldn't assert(0) on 
invalid input - just skip one byte. Also, are we right not to worry about 
5- and 6-byte sequences? The docs keep on threatening with them, and then 
immediately mention those are not valid.

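void popFront3(ref char[] s) @trusted pure nothrow {
   immutable c = s[0];
   uint char_length = 1;
   if (c < 128)
   {
   // ASCII fast path; every other case jumps back here
   Lend :
     s = s.ptr[char_length .. s.length];
   } else {
     if (c < 192)
     {
       // continuation byte, invalid as a first byte: skip one byte
       goto Lend;
     }
     if (c < 224) {
       char_length = 2;
       goto Lend;
     }
     if (c < 240) {
       char_length = 3;
       goto Lend;
     }
     if (c < 248) {
        char_length = 4;
     }
     // c >= 248 is not a valid lead byte: char_length stays 1
     goto Lend;
   }
}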
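
For reference, the lead-byte ranges in question can be sketched as a 
standalone helper. This is only an illustration, not code from Phobos: 
the names utf8Stride and popFrontSketch are hypothetical, and the clamp 
to s.length is an added assumption so that a truncated sequence at the 
end of the string cannot slice past the buffer. Lead bytes 248 and above 
(the old 5- and 6-byte forms) are treated as invalid and skipped one 
byte at a time.

```d
// Hypothetical helper (not a Phobos function): how many bytes to skip
// for a UTF-8 sequence whose first byte is c.
uint utf8Stride(char c) @safe pure nothrow @nogc
{
    if (c < 0x80) return 1; // 0xxxxxxx: ASCII
    if (c < 0xC0) return 1; // 10xxxxxx: continuation byte, invalid as a lead
    if (c < 0xE0) return 2; // 110xxxxx
    if (c < 0xF0) return 3; // 1110xxxx
    if (c < 0xF8) return 4; // 11110xxx
    return 1;               // 0xF8..0xFF: 5-/6-byte leads, not valid UTF-8
}

// Hypothetical popFront variant built on the helper above.
void popFrontSketch(ref char[] s) @trusted pure nothrow @nogc
{
    immutable n = utf8Stride(s[0]);
    // Assumption: clamp so a truncated trailing sequence cannot overrun.
    immutable len = n < s.length ? n : s.length;
    s = s.ptr[len .. s.length];
}

unittest
{
    // 1-, 2-, 3- and 4-byte code points: 1 + 2 + 3 + 4 = 10 bytes.
    char[] s = "a\u00E9\u20AC\U0001F600".dup;
    assert(s.length == 10);
    popFrontSketch(s); assert(s.length == 9);
    popFrontSketch(s); assert(s.length == 7);
    popFrontSketch(s); assert(s.length == 4);
    popFrontSketch(s); assert(s.length == 0);
}
```

Unlike the goto version, this goes through a function call and a clamp, 
so it is not tuned for instruction count; it only pins down which byte 
ranges map to which length.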


Andrei

On 10/10/16 9:39 PM, Stefan Koch wrote:
> This version has 24 instructions, but these have a smaller encoding
> and are generally inexpensive.
> With inline asm and conditional moves it would be possible to reduce
> this even further,
> to ~20 instructions.
>
> void popFront1(ref char[] s) @trusted pure nothrow {
>   immutable c = s[0];
>   size_t char_length = 1;
>   if (c < 128)
>   {
>     goto Lend;
>   } else {
>     if ((c & 0b1100_0000) == 0b1000_0000)
>     {
>       // This is invalid as a first char
>       goto Lerror;
>     }
>     if (c < 224)
>     {
>       char_length = 2;
>       goto Lend;
>     }
>     if (c < 240) {
>       char_length = 3;
>       goto Lend;
>     }
>     if (c < 248) {
>        char_length = 4;
>       goto Lend;
>     }
>
>     //These characters are also no longer valid
>     Lerror : assert(0);
>   }
>   Lend :
>   s = s.ptr[char_length .. s.length];
> }
>


