Strange counter-performance in an alternative `decimalLength9` function

Bruce Carneal bcarneal at gmail.com
Thu Feb 27 03:58:15 UTC 2020


On Wednesday, 26 February 2020 at 23:09:34 UTC, Basile B. wrote:
> On Wednesday, 26 February 2020 at 20:44:31 UTC, Bruce Carneal 
> wrote:
>>
>> After shuffling the input, branchless wins by 2.4X (240%).
>
> I've replaced the input by the front of a rndGen (that pops for 
> count times and starting with a custom seed) and I never see 
> the decimalLength9_3 (which seems to be the closest to the 
> original in term of performances) doing better.

The uniform random uint distribution yields a highly skewed 
digits distribution.  Note, for example, that only 10 of the 
2**32 possible outputs from a uint PRNG can be represented by a 
single decimal digit.

Your current winner will be very hard to beat if the inputs are 
uniform random.  That implementation's high-to-low cascade of 
early exit ifs aligns with PRNG output.

If you aren't sure what the input distribution is, or want 
guaranteed good worst case behavior, then branchless may be a 
better way to go.

>
> Maybe you talked about another implementation of decimalLength9 
> ?

Yes.  It's one I wrote after I saw your post. Psuedo-code here:

   auto d9_branchless(uint v) { return 1 + (v >= 10) + (v >= 100) 
... }

Using ldc to target an x86 with the above yields a series of 
cmpl, seta instruction pairs in the function body followed by a 
summing and a return.  No branching.

Let me know if the above is unclear or insufficient.























More information about the Digitalmars-d-learn mailing list