class templates and static if

Mon Feb 27 11:55:39 PST 2012

On 27.02.2012 23:18, Jonathan M Davis wrote:
> On Monday, February 27, 2012 10:55:12 Ali Çehreli wrote:
>> On 02/27/2012 08:29 AM, Tyler Jameson Little wrote:
>>> I didn't want to do subclassing, because my parser is a state-machine
>>
>> style
>>
>>> parser, so it's in a big switch. Pretty gross, but I would like it to
>>
>> be as
>>
>>> fast as possible. That's why I thought this model would be so cool,
>>
>> because
>>
>>> I could remove conditions from the generated code, and get rid of a
>>
>> lot of
>>
>>> the conditionals.
>>
>> I am pretty sure switch statements boil down to a sequence conditionals
>> consisting of equality comparisons.
>>
>> I know that some compilers use optimizations where the comparisons are
>> converted to a single lookup, but last I checked, dmd does not apply
>> that optimization. Perhaps because it's not implemented yet, or because
>> using a table lookup might be slower because of reaching outside of the
>> cpu cache. (Or another reason that I am not aware of. :))
>
> Yeah. In theory, switch statements can be optimized better than if-else
> chains, and eventually I'd fully expect that dmd would do that, but right now,
> I don't think that they really are. You'd have to look at the generated
> assembly though to be sure though.

Please stop spreading the misconception that they are going to be 
optimized "probably and eventually". Switch statements over integers are 
much better then if/else chains and were for something like 20 years. 
They do produce nice improvements over a sequence of if/else because 
values are known in advance and are bounded.
So it goes like this:
1. If all numbers are tightly packed (e.g. almost all of M..N range is 
covered by a case) then it does a jump table. So there is almost no 
comparisons at all (just check it fits the range) and it's just one 
indirect jump.

2.Even failing that it does a binary-search like comparisons to speed 
thing up.

3. Sometimes linear searching is faster then binary (on small number of 
case) and it can estimate this threshold. Same goes for table vs linear 
search and table vs binary search.

4. In fact it even does combination of all of the above, e.g. separate 
range in two by one comparison then use jump tables in both etc.

For instance:

switch(x){
	case 0:
	...
	case 1:
  	...
	case 4:
	...
	case 9:
	... //all cases used up to 40
	case 40:

It might do something like (depends on compiler vendor ;) ):

if( x < 9)
{
	if(x == 0)
		...

	else if(x == 1)
		...

	else if(x == 4)
		...
}
else {
	if(x <=40)
		goto table[x]; //pseudocode
	else
		goto default;
}

And for the record I seen all of that in assembly listings produced by 
dmd, when optimizing VM dispatch in std.regex. I had to revisit opcodes 
layout a bit so that dmd  outputs jump table and not a binary-searching 
like stuff in a tight loop.

-- 
Dmitry Olshansky