Strange counter-performance in an alternative `decimalLength9` function

Basile B. b2.temp at gmx.com
Thu Feb 27 09:54:48 UTC 2020


On Thursday, 27 February 2020 at 09:41:20 UTC, Basile B. wrote:
> On Thursday, 27 February 2020 at 09:33:28 UTC, Dennis Cote 
> wrote:
>> [...]
>
> Sorry but no. I think that you have missed how this has changed 
> since the first message.
> 1. the way it was tested initially was wrong because LLVM was 
> optimizing some stuff in some tests and not others, due to 
> literal constants.
> 2. Apparently there would be a branchless version that's fast 
> when testing with unbiased input (to be verified)
>
> this version is:
>
> ---
> ubyte decimalLength9_4(const uint v) pure nothrow
> {
>     return 1 +  (v >= 10) +
>                 (v >= 100) +
>                 (v >= 1000) +
>                 (v >= 10000) +
>                 (v >= 100000) +
>                 (v >= 1000000) +
>                 (v >= 10000000) +
>                 (v >= 100000000) ;
> }
> ---
>
> but I cannot see the improvement when I use time on the test 
> program with 100000000 calls fed with random numbers.
>
> see 
> https://forum.dlang.org/post/ctidwrnxvwwkouprjszw@forum.dlang.org for the latest evolution of the discussion.

Maybe just add your version to the test program and run

time ./declen -c100000000 -f0 -s137 // original
time ./declen -c100000000 -f4 -s137 // the 100% branchless
time ./declen -c100000000 -f5 -s137 // the LUT + branchless for the bit num that need attention
time ./declen -c100000000 -f6 -s137 // assumed to be your version

to see if it beats the original. I cannot do it right now, but 
otherwise I will try tomorrow.
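
Before timing a new candidate it may be worth checking that it returns the same results as a plain reference. What follows is only a minimal sketch, not the actual test program (its source is not shown in this message). `decimalLengthRef`, `agreesOnBoundaries`, the fixed seed and the call count are assumptions made up for illustration; only `decimalLength9_4` comes from the quoted message. Inputs are kept below 10^9, which is what the `9` in the function name suggests.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : Random, uniform;
import std.stdio : writefln;

// Branchless version, copied from the quoted message.
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 +  (v >= 10) +
                (v >= 100) +
                (v >= 1000) +
                (v >= 10000) +
                (v >= 100000) +
                (v >= 1000000) +
                (v >= 10000000) +
                (v >= 100000000) ;
}

// Hypothetical reference: counts digits by repeated division.
ubyte decimalLengthRef(uint v) pure nothrow
{
    ubyte n = 1;
    while (v >= 10)
    {
        v /= 10;
        ++n;
    }
    return n;
}

// Compares both versions at the smallest and largest value of each
// digit count from 1 to 9, plus zero.
bool agreesOnBoundaries()
{
    uint lo = 1; // smallest value with the current digit count
    foreach (_; 0 .. 9)
    {
        const uint hi = lo * 10 - 1; // largest value with that digit count
        if (decimalLength9_4(lo) != decimalLengthRef(lo)) return false;
        if (decimalLength9_4(hi) != decimalLengthRef(hi)) return false;
        lo *= 10;
    }
    return decimalLength9_4(0) == decimalLengthRef(0);
}

void main()
{
    assert(agreesOnBoundaries());

    // Assumed setup: fixed seed, inputs drawn uniformly from [0, 10^9).
    auto rng = Random(137);
    enum calls = 1_000_000;
    uint checksum; // keeps the optimizer from dropping the calls

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. calls)
        checksum += decimalLength9_4(uniform(0u, 1_000_000_000u, rng));
    sw.stop();

    writefln("%s calls: %s (checksum %s)", calls, sw.peek, checksum);
}
```

Note that uniformly drawn inputs are almost all 8 or 9 digits long, so this says little about behaviour on short numbers, and the random number generation is included in the measured time.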

