Can you shrink it further?
Ethan Watson via Digitalmars-d
digitalmars-d at puremagic.com
Tue Oct 11 07:04:30 PDT 2016
On Tuesday, 11 October 2016 at 10:01:41 UTC, Stefan Koch wrote:
> On Tuesday, 11 October 2016 at 09:45:11 UTC, Temtaime wrote:
>>
>> Sorry this was also a typo in the code.
>>
>> void popFront7(ref char[] s) @trusted pure nothrow
>> {
>> import core.bitop;
>> auto v = 7 - bsr(~s[0] | 1);
>> s = s[v > 6 ? 1 : (v ? (v > s.length ? s.length : v) : 1)..$];
>> }
>>
>> Please check this.
>
> 162 us
The branching, it hurts my eyes!
Something like the following should give correct, branchless
results (assuming I haven't written bad logic) using
architecture-optimised min/max calls. Note that the minus/plus 1
operation on the third line, combined with the sign
multiplication, ensures that a value of 7 maps to 1, whereas for
all other values it's just an extra operation. The advantage is
that you're not sticking three branches in close proximity to
each other, so there is nothing there for the branch predictor to
get wrong. (Of note, any performance test for these functions
should use data designed to defeat the branching code I quoted,
keeping in mind that desktop Intel processors have a four-state
branch predictor. I've not performance tested it myself, but this
will certainly run faster on the AMD Jaguar processors than a
version with branching checks.)
int v = 7 - bsr( ( ~s[0] & 0xFF ) | 1 );
int sign = ( v - 7 ) >>> 31;       // 1 when v < 7, 0 when v == 7
v = max( v - 1, 0 ) * sign + 1;    // v of 0 and v of 7 both map to 1
s = s[ min( cast(size_t) v, s.length ) .. $ ];
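For anyone who wants to check the logic before timing it, here is a minimal sketch that wraps both variants in functions and compares them over every possible lead byte and a range of buffer lengths. The names popFrontBranchy, popFrontBranchless, consumed and agreesWithBranchy are made up for this example. The `& 0xFF` mask and the unsigned shift are my assumptions, added so the result does not depend on how `~` promotes a char. It checks agreement only and says nothing about speed.

```d
import core.bitop : bsr;
import std.algorithm.comparison : max, min;

// Branching version quoted above, used here as the reference.
// The & 0xFF mask is an addition (assumption) to keep ~ confined to 8 bits.
void popFrontBranchy(ref char[] s) @trusted pure nothrow
{
    immutable int v = 7 - bsr((~cast(uint) s[0] & 0xFF) | 1);
    s = s[v > 6 ? 1 : (v ? (v > s.length ? s.length : v) : 1) .. $];
}

// Branchless sketch: v is the count of leading 1 bits in the lead byte,
// clamped to 7 by the "| 1".
void popFrontBranchless(ref char[] s) @trusted pure nothrow
{
    int v = 7 - bsr((~cast(uint) s[0] & 0xFF) | 1);
    immutable int sign = (v - 7) >>> 31;  // 1 when v < 7, 0 when v == 7
    v = max(v - 1, 0) * sign + 1;         // v of 0 and v of 7 both map to 1
    s = s[min(cast(size_t) v, s.length) .. $];
}

// Number of code units the branchless version pops from a buffer of
// `len` chars whose first byte is `lead` (hypothetical test helper).
size_t consumed(ubyte lead, size_t len)
{
    auto buf = new char[len];
    buf[] = 'x';
    buf[0] = cast(char) lead;
    char[] s = buf;
    popFrontBranchless(s);
    return len - s.length;
}

// True when both versions leave the same remainder for all 256 lead
// bytes and buffer lengths 1 through 8.
bool agreesWithBranchy()
{
    foreach (b; 0 .. 256)
        foreach (len; 1 .. 9)
        {
            auto buf = new char[len];
            buf[] = 'x';
            buf[0] = cast(char) b;
            char[] a = buf, c = buf;
            popFrontBranchy(a);
            popFrontBranchless(c);
            if (a.length != c.length)
                return false;
        }
    return true;
}

unittest
{
    assert(agreesWithBranchy());
    assert(consumed(0xE2, 2) == 2);  // truncated 3-byte sequence is clamped
}
```

Saved to a file (say popfront_check.d, a placeholder name), this can be run with `dmd -unittest -main -run popfront_check.d`.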