core.bitop.bt not faster than & ?
Adam D. Ruppe via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Wed Dec 17 06:58:12 PST 2014
On Wednesday, 17 December 2014 at 14:12:16 UTC, Trollgeir wrote:
> I'd expect the bt function to be up to 32 times faster as I
> thought it only compared two bits, and not the entire length of
> bits in the uint.
The processor doesn't work in terms of bits like that - it still
needs to look at the whole integer. In fact, according to my
(OLD) asm reference, the bt instruction is slower than the and
instruction at the CPU level.
I think it has to do a wee bit more work: translating the 16 (the bit
index) into a mask, then moving the result into the carry flag... then
moving the flag back into a register to return the value. (That last
step could probably be skipped if you do an if() on it and the
compiler optimizes the branch, and the first step might be skipped
too if the index is a constant, since the compiler can rewrite the
instruction.) So given that, I'd expect what you saw: no difference
when they are optimized to the same thing or when the CPU's stars
align right, and & a bit faster when bt isn't optimized.
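To make the comparison concrete, here is a minimal sketch of the two ways to test one bit. The helper names testWithAnd and testWithBt are made up for illustration and are not from the original code, and I'm assuming bt is called on a size_t, since core.bitop.bt takes a pointer to size_t. Both helpers return the same answer; whether they compile to the same instructions depends on the compiler and the flags, so treat it as something to benchmark rather than a promise.

```d
import core.bitop : bt;

// Hypothetical helper: test bit `n` of `value` with a plain mask.
// `n` must be smaller than the number of bits in size_t.
bool testWithAnd(size_t value, size_t n)
{
    return (value & (cast(size_t) 1 << n)) != 0;
}

// Hypothetical helper: test the same bit with core.bitop.bt.
// bt takes a pointer to size_t and returns non-zero if the bit is set.
bool testWithBt(size_t value, size_t n)
{
    return bt(&value, n) != 0;
}

void main()
{
    size_t flags = cast(size_t) 1 << 16; // only bit 16 is set

    assert(testWithAnd(flags, 16) && testWithBt(flags, 16));
    assert(!testWithAnd(flags, 15) && !testWithBt(flags, 15));
}
```

If the bit index is a compile-time constant and the result feeds straight into an if(), an optimizing build has room to turn both versions into the same code, which would match seeing no difference in timing.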
bt() and friends are special instructions for specialized use
cases. Probably useful for threading and stuff.