Compiler noob here. This code:

    auto a = popcnt(bitset);
    auto b = bsf(bitset);

generates these calls:

    call pure nothrow @nogc @safe int core.bitop.popcnt(uint)@PLT
    call pure nothrow @nogc @safe int core.bitop.bsf(uint)@PLT

Why doesn't the compiler emit the bsf/popcnt instructions directly? Aren't these calls slower? (The same question applies everywhere a call through @PLT shows up.)
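In case it helps anyone reproduce this, here is a minimal sketch of a complete program around those two lines. The value of `bitset` is an arbitrary one I picked for illustration, and `uint` is assumed from the signatures in the call output above.

```d
import core.bitop : bsf, popcnt;
import std.stdio : writeln;

void main()
{
    // Arbitrary example value (assumption): bits 4, 5 and 7 are set.
    uint bitset = 0b1011_0000;

    auto a = popcnt(bitset); // number of set bits
    auto b = bsf(bitset);    // index of the lowest set bit (undefined for 0)

    writeln(a, " ", b); // prints "3 4"
}
```

Comparing the generated assembly with and without optimization (for example dmd's `-O -inline`) should show whether the calls stay or get replaced; I have not checked this across compilers.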