Fast 2D matrix of bits

Denis Shelomovskij verylonglogin.reg at gmail.com
Tue Sep 20 02:00:44 PDT 2011


20.09.2011 10:22, bearophile пишет:
> Josh Simmons:
>
>> You could also just...
>>
>> uint32_t next_pow_2(uint32_t n)
>> {
>>      n -= 1;
>>      n |= n>>  1;
>>      n |= n>>  2;
>>      n |= n>>  4;
>>      n |= n>>  8;
>>      n |= n>>  16;
>>      return n + 1;
>> }
>
> My version with bsr is faster.
>
> Bye,
> bearophile

Is it? Every conditional branch can break CPU command queue and slow 
down execution. Anyway, I saw same problem not far-away (2^n sized 
OpenGL textures). IMHO (too lazy to make tests) it is the fastest:

---
bool isPow2(int n)
in { assert(n > 0); }
body { return !((n - 1) & n); }

unittest {
	assert(!isPow2(0));
	assert(isPow2(1));
	assert(isPow2(2));
	assert(!isPow2(3));
	assert(isPow2(4));
	assert(!isPow2(5));
}

int toPow2(int n)
in { assert(n > 0); }
body {
	return 1 << (bsr(n) + !isPow2(n));
}

unittest {
	assert(toPow2(1) == 1);
	assert(toPow2(2) == 2);
	assert(toPow2(3) == 4);
	assert(toPow2(4) == 4);
	assert(toPow2(5) == 8);
}
---


A short version (AFAIK, !! is one lexem for dmd):
---
int toPow2(int n) {
	return 1 << (bsr(n) + !!((n - 1) & n));
}
---


More information about the Digitalmars-d mailing list