value range propagation for _bitwise_ OR

Clemens eriatarka84 at gmail.com
Tue Apr 13 08:10:24 PDT 2010


Adam Ruppe Wrote:

> Jerome's highbit function is the same as std.intrinsic.bsr. I wonder
> which is faster?
> 
> I ran a test, and for 100 million iterations (1..10000000), the
> intrinsic beat out his function by a mere 300 milliseconds on my box.
> - highbit ran in an average of 1897 ms and bsr did the same in an
> average of 1534.
> 
> Recompiling with -inline -O -release cuts the raw numbers about in
> half, but keeps about the same difference, leading me to think
> overhead accounts for a fair amount of the percentage instead of actual
> implementation. The new averages are 1134 and 853.

That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I guess is the reason it's an intrinsic in the first place). I would have expected this to be much, much faster than a user function. Anyone care enough to check the generated assembly?

[1] http://www.itis.mn.it/linux/quarta/x86/bsr.htm
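To make the comparison concrete, here is a rough C sketch of the two approaches. It is an illustration only: it is neither Jerome's D function (which isn't quoted in this mail) nor dmd's intrinsic. The names highbit_loop and highbit_builtin are made up for this example, and __builtin_clz is a GCC/Clang extension that those compilers typically lower to bsr or lzcnt on x86. Both functions return the index of the highest set bit and require a nonzero argument.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical user-level version: find the index of the highest set bit
 * by shifting right until nothing is left. Takes up to 31 iterations. */
static int highbit_loop(uint32_t x)
{
    assert(x != 0);               /* result is undefined for 0, like bsr */
    int index = 0;
    while (x >>= 1)
        ++index;
    return index;
}

/* Hypothetical intrinsic-backed version. __builtin_clz counts leading
 * zero bits of an unsigned int, so the highest set bit is at
 * (width - 1) - clz. Falls back to the loop on other compilers. */
static int highbit_builtin(uint32_t x)
{
    assert(x != 0);               /* __builtin_clz(0) is undefined */
#if defined(__GNUC__) || defined(__clang__)
    const int width = (int)(sizeof(unsigned int) * CHAR_BIT);
    return (width - 1) - __builtin_clz((unsigned int)x);
#else
    return highbit_loop(x);
#endif
}
```

If a benchmark loop around two functions like these shows only a small gap, that would fit Adam's guess that loop and call overhead dominate the timing; the generated assembly is still the way to confirm it.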

-- Clemens


