Switch codegen.

claptrap clap at trap.com
Sun Aug 13 22:55:12 UTC 2023


On Sunday, 13 August 2023 at 20:52:18 UTC, Johan wrote:
> On Sunday, 13 August 2023 at 18:57:42 UTC, claptrap wrote:
>> On Sunday, 13 August 2023 at 13:13:30 UTC, Johan wrote:
>>> I'm quite certain that if `x` is constant, the bounds check 
>>> and table lookup will be hoisted out of the loop, indeed 
>>> leading to just an indirect jump inside the loop.
>>> Of course, this requires running the optimizer (-Ox, x>0).
>>
>> I've tried as many variations as I can figure and it always 
>> generates this...
>>
>> ```D
>>     cmp     r14d, 3
>>     ja      .LBB0_9
>>     movsxd  rax, dword ptr [r12 + 4*r15]
>>     add     rax, r12
>>     jmp     rax
>> ```
>>
>> https://d.godbolt.org/z/jzoevxoG4
>>
>> That's with O2; with O3 I can't make head or tail of the code 
>> it generates, it's some pretty insane optimization.
>
> It's easier to look at LLVM IR in this case.
> https://d.godbolt.org/z/b6b3TTeMW
>
> 1)  By providing the implementation of `bar`, you are giving 
> the optimizer extra information. It may not inline `bar`, but 
> it _will_ make use of what it knows about the result of bar. 
> That's why at -O3, there is insane optimization going on. 
> Simply remove the implementation of `bar` for more insight 
> (make it a function declaration, not a definition).
>
> 2) Indeed at -O2, the optimizer deems it non-profitable or does 
> not see that it can hoist the load out of the for loop.  At -O3 
> it does hoist it out, like you wanted.
>
> cheers,
>   Johan

It hasn't hoisted the load out; it has transformed the code from 
a loop containing a switch into a switch containing 5 loops. The 
switch statement is never branched back to.
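
To make that concrete, here is a rough sketch of the two shapes. 
This is not the code from the godbolt links; the function names 
and case bodies are made up for illustration. It just shows what 
I mean by a "switch containing loops", which looks like what is 
usually called loop unswitching.

```D
// Hypothetical example: names and case bodies are assumptions,
// not taken from the code behind the godbolt links.

// Shape as written: the switch on the loop-invariant `x`
// sits inside the loop and is re-dispatched every iteration.
int loopWithSwitch(const(int)[] data, int x)
{
    int acc = 0;
    foreach (v; data)
    {
        switch (x)
        {
            case 0: acc += v; break;
            case 1: acc -= v; break;
            case 2: acc ^= v; break;
            case 3: acc |= v; break;
            default: break;
        }
    }
    return acc;
}

// Roughly the shape after the transformation: dispatch once,
// then run one of 5 specialised loops (4 cases plus default).
int switchWithLoops(const(int)[] data, int x)
{
    int acc = 0;
    switch (x)
    {
        case 0: foreach (v; data) acc += v; break;
        case 1: foreach (v; data) acc -= v; break;
        case 2: foreach (v; data) acc ^= v; break;
        case 3: foreach (v; data) acc |= v; break;
        default: foreach (v; data) {} break;
    }
    return acc;
}
```

Both functions return the same result for any `x`; the second 
one just never returns to the switch once a loop has started.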

That doesn't work for my case, probably because the loop is too 
large. Even with O3, the switch statement is still...

00007FF7C703075F  movsxd      rdi,dword ptr [rdx+rsi*4]
00007FF7C7030763  add         rdi,rdx
00007FF7C7030766  jmp         rdi

It also still checks the switch index inside the loop, although 
it has moved the check a bit earlier in the code. It seems it can 
tell that some of the code before the switch has no effect if the 
switch is bypassed.




More information about the digitalmars-d-ldc mailing list