Strange counter-performance in an alternative `decimalLength9` function
Basile B.
b2.temp at gmx.com
Thu Feb 27 09:54:48 UTC 2020
On Thursday, 27 February 2020 at 09:41:20 UTC, Basile B. wrote:
> On Thursday, 27 February 2020 at 09:33:28 UTC, Dennis Cote
> wrote:
>> [...]
>
> Sorry but no. I think that you have missed how this has changed
> since the first message.
> 1. The way it was tested initially was wrong because LLVM was
> optimizing some stuff in some tests and not in others, due to
> literal constants.
> 2. Apparently there would be a branchless version that's fast
> when tested with unbiased input (to be verified).
>
> this version is:
>
> ---
> ubyte decimalLength9_4(const uint v) pure nothrow
> {
>     return 1 + (v >= 10) +
>                (v >= 100) +
>                (v >= 1000) +
>                (v >= 10000) +
>                (v >= 100000) +
>                (v >= 1000000) +
>                (v >= 10000000) +
>                (v >= 100000000);
> }
> ---
>
> but I cannot see the improvement when using time on the test
> program with 100000000 calls fed with random numbers.
>
> see
> https://forum.dlang.org/post/ctidwrnxvwwkouprjszw@forum.dlang.org for the latest evolution of the discussion.
Maybe just add your version to the test program and run
time ./declen -c100000000 -f0 -s137 // original
time ./declen -c100000000 -f4 -s137 // the 100% branchless
time ./declen -c100000000 -f5 -s137 // the LUT + branchless for the bit numbers that need attention
time ./declen -c100000000 -f6 -s137 // assumed to be your version
to see if it beats the original. The thing is that I cannot do it
right now, but otherwise I will try tomorrow.
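In the meantime, here is a minimal standalone sketch of how the branchless version could be sanity-checked and roughly timed on random input. This is not the `declen` test program. The helper `decimalLengthRef`, the call count, the input range, and the explicit `cast(ubyte)` are assumptions made for illustration, and the seed is arbitrary.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : Random, uniform;
import std.stdio : writeln;

// Branchless version from the quoted message, repeated so that the sketch
// compiles on its own. The explicit cast is an addition for safety.
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return cast(ubyte)(1 + (v >= 10) +
                           (v >= 100) +
                           (v >= 1000) +
                           (v >= 10000) +
                           (v >= 100000) +
                           (v >= 1000000) +
                           (v >= 10000000) +
                           (v >= 100000000));
}

// Hypothetical reference used only for cross-checking: count digits by
// repeated division.
ubyte decimalLengthRef(uint v) pure nothrow
{
    ubyte n = 1;
    while (v >= 10)
    {
        v /= 10;
        ++n;
    }
    return n;
}

void main()
{
    // Check both ends of every digit range: 1..9, 10..99, ..., up to 999_999_999.
    uint p = 1;
    foreach (digits; 1 .. 10)
    {
        assert(decimalLength9_4(p) == digits);
        assert(decimalLength9_4(p * 10 - 1) == digits);
        assert(decimalLength9_4(p) == decimalLengthRef(p));
        p *= 10;
    }

    // Rough timing on random input. The random number generator is inside the
    // timed loop, so its cost is included in the measurement. Inputs drawn
    // uniformly from [0, 10^9) have 9 digits about 90% of the time.
    auto rng = Random(137); // arbitrary fixed seed
    enum count = 1_000_000; // assumed call count, kept small for a quick run
    ulong sum = 0;
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. count)
        sum += decimalLength9_4(uniform(0u, 1_000_000_000u, rng));
    sw.stop();
    // Printing the sum keeps the loop from being optimized away.
    writeln("sum: ", sum, ", elapsed: ", sw.peek);
}
```

The same loop can be repeated with another candidate function in place of `decimalLength9_4` to compare them under identical input.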