Efficient idiom for fastest code

IntegratedDimensions IntegratedDimensions at gmail.com
Wed May 23 03:12:52 UTC 2018


On Wednesday, 23 May 2018 at 03:00:17 UTC, Nicholas Wilson wrote:
> On Wednesday, 23 May 2018 at 02:24:08 UTC, IntegratedDimensions 
> wrote:
>>
>> Many times in expensive loops one must make decisions. 
>> Decisions must be determined and the determination costs.
>>
>> for(int i = 0; i < N; i++)
>> {
>>     if(decision(i)) A; else B;
>> }
>>
>> the if statement costs N times the cycle cost.
>>
>> In some cases the decision holds for continuous ranges. For 
>> some 0 <= n <= N the decision is constant, but n is 
>> arbitrary(determined by unknown factors at compile time).
>>
>> One can speed up the routine by using something akin to a 
>> simplified strategy pattern where one uses 
>> functions/delegates/lambdas to code a faster version without 
>> the test:
>>
>>
>> for(int i = 0; i < N; i++)
>> {
>>     d = (){ if(decision(i)) A; else d = () { B; };
>>     d();
>> }
>>
>
> assuming you meant
>> auto d = (int i){ if(decision(i)) A; else d = (int i) { B; };};
>> for(int i = 0; i < N; i++)
>> {
>>     d(i);
>> }
>>
>
>> this code basically reduces to
>>
>> for(int i = 0; i < N; i++)
>> {
>>     B;
>> }
>>
>> Once the decision fails, which we assume once it fails it 
>> always fails in this particular case.
>>
>> Therefor, once the decision fails it kicks in the "faster" 
>> code. Suppose decision is very slow.
>>
>> The cycle cost is then n times rather than N times.
>>
>> In general, such techniques can be used to speed up code, 
>> sometimes significantly, but are difficult to implement using 
>> a standard pattern and for more complex decision functions.
>>
>>
>> Does anyone see any way to use some some of D's power to help 
>> simplify such things but not using something like a strategy 
>> pattern or complexifying the overall design(using advanced 
>> techniques such as templates, mixins is not making the code 
>> more complex).
>
> If decision is pure, consider using static foreach 
> (iota(N).map!decision) { ... }to unroll it at compile time. 
> even if it isn't the compiler may still be able to factor out 
> parts of repeated calls to decision, look at the assembly/IR to 
> confirm this.
>
> Otherwise PROFILE!!!!! (and use profile guided optimisation! 
> this implies using a compiler other than DMD which if you care 
> about performance you should be doing anyway)
>
> Blindly trying to optimise is just as likely to hurt 
> performance.
> in particular don't underestimate the branch predictor. Even if 
> decision is complex, if there is a pattern in evaluating 
> iota(n).map!decision the branch predictor will pick up on it.
>
> In terms of exploiting knowledge of decision a priori that the 
> compiler somehow lacks you really don't have much option but to 
> do it yourself.

I knew someone was going to say that and I forgot to say DON'T!

Saying to profile when I clearly said these ARE cases where they 
are slow is just moronic. Please don't use default answers to 
arguments.

This was a general question about cases on how to attack a 
problem WHEN profiling says I need to optimize. Your SO 101 
answer sucks! Sorry!

To prove to you that your answer is invalid:

I profile my code, it says that it is very slow and shows that it 
is do to the decision checking... I then I have to come here and 
write up a post trying to explain how to solve the problem. I 
then get a post telling me I should profile. I then respond I did 
profile and that this is my problem. A lot of wasted energy when 
it is better to know a general attack strategy. Yes, some of us 
can judge if code is needed to be optimized before profiling. It 
is not difficult. Giving a generic answer that always does not 
apply and is obvious to anyone trying to do optimization is not 
helpful. Everyone today pretty must does not even optimize code 
anymore... this isn't 1979. It's not ok to keep repeating the 
same mantra. I guess we should turn this in to a meme?

The reason I'm getting on to you is that the "profile before 
optimization" sounds a bit grade school, specially since I wasn't 
talking anything about profiling but a general programming 
pattern speed up code, which is always valid but not always 
useful(and hence that is when profiling comes in).



More information about the Digitalmars-d-learn mailing list