Can you shrink it further?
Matthias Bentrup via Digitalmars-d
digitalmars-d at puremagic.com
Tue Oct 11 00:30:26 PDT 2016
On Tuesday, 11 October 2016 at 04:05:47 UTC, Stefan Koch wrote:
> On Tuesday, 11 October 2016 at 03:58:59 UTC, Andrei
> Alexandrescu wrote:
>> On 10/10/16 11:00 PM, Stefan Koch wrote:
>>> On Tuesday, 11 October 2016 at 02:48:22 UTC, Andrei
>>> Alexandrescu wrote:
>>>>[...]
>>>
>>> If you want to skip a byte it's easy to do as well.
>>>
>>> void popFront3(ref char[] s) @trusted pure nothrow {
>>> immutable c = s[0];
>>> uint char_length = 1;
>>> if (c < 128)
>>> {
>>> Lend :
>>> s = s.ptr[char_length .. s.length];
>>> } else {
>>> if ((c & 0b1100_0000) == 0b1000_0000)
>>> {
>>> //just skip one in case this is not the beginning of a code-point char
>>> goto Lend;
>>> }
>>> if (c < 224)
>>> {
>>> char_length = 2;
>>> goto Lend;
>>> }
>>> if (c < 240)
>>> {
>>> char_length = 3;
>>> goto Lend;
>>> }
>>> if (c < 248)
>>> {
>>> char_length = 4;
>>> goto Lend;
>>> }
>>> }
>>> }
>>>
>>
>> Affirmative. That's identical to the code in "[ ... ]" :o).
>> Generated code still does a jmp forward though. -- Andrei
>
> It was not identical.
> ((c & 0b1100_0000) == 0b1000_0000)
> Can be true in all of the 3 following cases.
> If we do not do a jmp to return here, we cannot guarantee that
> we will not skip over the next valid char.
> Thereby corrupting already corrupt strings even more.
>
> For best performance we need to leave the gotos in there.
A branch-free version:
void popFront4(ref char[] s) @trusted pure nothrow {
    immutable c = s[0];
    // UTF-8 lead bytes: 192..223 start 2 bytes, 224..239 3 bytes, 240.. 4 bytes
    uint char_length = 1 + (c >= 192) + (c >= 224) + (c >= 240);
    s = s.ptr[char_length .. s.length];
}
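For anyone who wants to try this, here is a minimal sketch that checks a branch-free length computation against std.utf.stride on a small well-formed string. The names leadLength and popFrontBranchFree are hypothetical, the thresholds are the standard UTF-8 lead-byte boundaries (0xC0, 0xE0, 0xF0), and the code assumes non-empty, well-formed input, since it does no bounds checking.

```d
import std.utf : stride;

// Hypothetical helper: sequence length from the lead byte alone.
// Continuation bytes (0x80..0xBF) and ASCII both yield 1.
uint leadLength(char c) @safe pure nothrow @nogc
{
    return 1 + (c >= 0xC0) + (c >= 0xE0) + (c >= 0xF0);
}

// Hypothetical variant of popFront4; assumes s is non-empty and
// not truncated in the middle of a sequence (no bounds check).
void popFrontBranchFree(ref char[] s) @trusted pure nothrow @nogc
{
    immutable len = leadLength(s[0]);
    s = s.ptr[len .. s.length];
}

void main()
{
    // 1-, 2-, 3-, 4- and 1-byte sequences: 11 code units, 5 code points.
    char[] s = "a\u00E4\u20AC\U0001F600z".dup;
    size_t count;
    while (s.length)
    {
        // On well-formed input the result should agree with Phobos.
        assert(leadLength(s[0]) == stride(s, 0));
        popFrontBranchFree(s);
        ++count;
    }
    assert(count == 5);
}
```

Whether the comparisons compile to setcc, cmov or real branches depends on the compiler and flags, so the generated code is worth checking.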
Theoretically the char_length could be computed with three sub
and addc instructions, but no compiler is smart enough to detect
that.
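To illustrate the idea, here is a sketch that emulates the borrow of each subtraction in portable arithmetic: for a byte value c, (T - 1) - c wraps around in 32-bit unsigned arithmetic exactly when c >= T, so bit 31 plays the role of the carry flag. The name leadLengthCarry is hypothetical, the thresholds are the standard UTF-8 lead-byte boundaries, and nothing here shows that a compiler will actually turn this into sub and add-with-carry instructions.

```d
// Hypothetical helper: adds up the "borrow" of three subtractions.
uint leadLengthCarry(char ch) @safe pure nothrow @nogc
{
    immutable uint c = ch; // 0 .. 255
    // (T - 1) - c underflows exactly when c >= T, which sets bit 31;
    // shifting that bit down yields the borrow as 0 or 1.
    return 1
        + ((0xBFu - c) >> 31)   // 1 if c >= 0xC0
        + ((0xDFu - c) >> 31)   // 1 if c >= 0xE0
        + ((0xEFu - c) >> 31);  // 1 if c >= 0xF0
}

void main()
{
    // Exhaustive check against the comparison-based formula.
    foreach (uint b; 0 .. 256)
    {
        immutable c = cast(char) b;
        immutable expected = 1 + (c >= 0xC0) + (c >= 0xE0) + (c >= 0xF0);
        assert(leadLengthCarry(c) == expected);
    }
}
```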