Can you shrink it further?

Stefan Koch via Digitalmars-d digitalmars-d at puremagic.com
Mon Oct 10 21:05:47 PDT 2016


On Tuesday, 11 October 2016 at 03:58:59 UTC, Andrei Alexandrescu 
wrote:
> On 10/10/16 11:00 PM, Stefan Koch wrote:
>> On Tuesday, 11 October 2016 at 02:48:22 UTC, Andrei 
>> Alexandrescu wrote:
>>> That looks good. I'm just worried about the jump forward - 
>>> ideally the
>>> case c < 127 would simply entail a quick return. I tried a 
>>> fix, but it
>>> didn't do what I wanted in ldc. We shouldn't assert(0) if 
>>> wrong - just
>>> skip one byte. Also, are we right to not worry about 5- and 
>>> 6-byte
>>> sequences? The docs keep on threatening with it, and then 
>>> immediately
>>> mention those are not valid.
>>>
>>> [ ... ]
>>>
>>> Andrei
>>>
>>
>> If you want to skip a byte it's easy to do as well.
>>
>> void popFront3(ref char[] s) @trusted pure nothrow {
>>    immutable c = s[0];
>>    uint char_length = 1;
>>    if (c < 128)
>>    {
>>    Lend :
>>      s = s.ptr[char_length .. s.length];
>>    } else {
>>      if ((c & 0b1100_0000) == 0b1000_0000)
>>      {
>>        // just skip one in case this is not the beginning of a
>>        // code-point char
>>        goto Lend;
>>      }
>>      if (c < 224)
>>      {
>>        char_length = 2;
>>        goto Lend;
>>      }
>>      if (c < 240)
>>      {
>>        char_length = 3;
>>        goto Lend;
>>      }
>>      if (c < 248)
>>      {
>>        char_length = 4;
>>        goto Lend;
>>      }
>>    }
>>  }
>>
>
> Affirmative. That's identical to the code in "[ ... ]" :o). 
> Generated code still does a jmp forward though. -- Andrei

It was not identical.
A byte for which ((c & 0b1100_0000) == 0b1000_0000) holds, i.e. a
continuation byte, also passes all 3 of the following checks.
If we do not jmp straight to the return here, we cannot guarantee that we
will not skip over the next valid char, thereby corrupting already
corrupt strings even more.

For best performance we need to leave the gotos in there.
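
To make the continuation-byte argument concrete, here is a minimal sketch
of the skip-one-byte behaviour. It is not the Phobos implementation and not
the code from this thread: the names popFrontSketch and isContinuation are
made up for illustration, the lead-byte bounds (0xE0, 0xF0, 0xF8) are the
standard UTF-8 ones, and the gotos are replaced by an else-if chain for
readability, so it says nothing about the generated jmp. The clamp against
s.length is an added assumption to keep the slice in bounds on truncated
input.

```d
// Hypothetical helper: true for UTF-8 continuation bytes (10xx_xxxx).
bool isContinuation(char c) @safe pure nothrow
{
    return (c & 0b1100_0000) == 0b1000_0000;
}

// Sketch of a popFront that skips exactly one byte on invalid input.
void popFrontSketch(ref char[] s) @trusted pure nothrow
{
    immutable c = s[0];
    size_t charLength = 1;
    // ASCII and stray continuation bytes both advance by one byte only.
    if (c >= 0x80 && !isContinuation(c))
    {
        if (c < 0xE0)      charLength = 2; // 110x_xxxx
        else if (c < 0xF0) charLength = 3; // 1110_xxxx
        else if (c < 0xF8) charLength = 4; // 1111_0xxx
        // anything else is invalid: keep charLength == 1
    }
    // Assumption: clamp so a truncated sequence cannot slice past the end.
    if (charLength > s.length)
        charLength = s.length;
    s = s.ptr[charLength .. s.length];
}

void main()
{
    // 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes)
    char[] s = "a\u00E9\u20AC".dup;
    popFrontSketch(s);
    assert(s.length == 5);
    popFrontSketch(s);
    assert(s.length == 3);
    popFrontSketch(s);
    assert(s.length == 0);

    // A stray continuation byte followed by a valid char: only the stray
    // byte is dropped, the 'b' survives.
    char[] t = [cast(char) 0xA9, 'b'];
    popFrontSketch(t);
    assert(t == "b");
}
```

If the continuation byte were allowed to fall through to the range checks,
it would be treated as a lead byte and the 'b' in the last case would be
skipped along with it, which is the corruption described above.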


