Counting bits in a ulong

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Jun 4 17:38:48 UTC 2020


On Thu, Jun 04, 2020 at 06:40:25AM -0700, H. S. Teoh via Digitalmars-d wrote:
> On Thu, Jun 04, 2020 at 01:27:39AM +0000, Paul Backus via Digitalmars-d wrote:
> [...]
> > Interesting stuff. Did you compare your implementations with the
> > popcnt function in core.bitop? There could be a potential improvement
> > here.
> 
> On Thu, Jun 04, 2020 at 07:42:07AM +0000, Patrick Schluter via Digitalmars-d wrote:
> [...]
> > You should try to use the popcnt instruction of the cpu (in asm or
> > with intrinsics). It's at least 3 or 4 order of magnitude (i.e.
> > between 1000 and 10000 times) faster than any of these tricks.
[...]
> As it turns out, my CPU (AMD Phenom II X6 1055T) doesn't support the
> instruction. :-(
[...]

Actually, I was wrong, my CPU *does* have this instruction, but I needed
to pass the right -mcpu flag to LDC before it will emit it.  After
turning it on, I got a huge performance boost; it completely moved the
function that calls popcnt out of the hotspot into the background! Now
it's other functions that have become the bottleneck.  Cool stuff!


T

-- 
What do you mean the Internet isn't filled with subliminal messages? What about all those buttons marked "submit"??


More information about the Digitalmars-d mailing list