Switch codegen.
claptrap
clap at trap.com
Sun Aug 13 22:55:12 UTC 2023
On Sunday, 13 August 2023 at 20:52:18 UTC, Johan wrote:
> On Sunday, 13 August 2023 at 18:57:42 UTC, claptrap wrote:
>> On Sunday, 13 August 2023 at 13:13:30 UTC, Johan wrote:
>>> I'm quite certain that if `x` is constant, the bounds check
>>> and table lookup will be hoisted out of the loop, indeed
>>> leading to just an indirect jump inside the loop.
>>> Of course, this requires running the optimizer (-Ox, x>0).
>>
>> I've tried as many variations as I can figure and it always
>> generates this...
>>
>> ```asm
>> cmp r14d, 3
>> ja .LBB0_9
>> movsxd rax, dword ptr [r12 + 4*r15]
>> add rax, r12
>> jmp rax
>> ```
>>
>> https://d.godbolt.org/z/jzoevxoG4
>>
>> That's with O2; with O3 I can't make head nor tail of the code
>> it generates, it's some pretty insane optimization.
>
> It's easier to look at LLVM IR in this case.
> https://d.godbolt.org/z/b6b3TTeMW
>
> 1) By providing the implementation of `bar`, you are giving
> the optimizer extra information. It may not inline `bar`, but
> it _will_ make use of what it knows about the result of bar.
> That's why at -O3, there is insane optimization going on.
> Simply remove the implementation of `bar` for more insight
> (make it a function declaration, not a definition).
>
> 2) Indeed at -O2, the optimizer deems it non-profitable or does
> not see that it can hoist the load out of the for loop. At -O3
> it does hoist it out, like you wanted.
>
> cheers,
> Johan
It hasn't hoisted the load out; it has transformed the code from a
loop containing a switch into a switch containing 5 loops. The switch
statement is never branched back to.
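To make the two shapes concrete, here is a minimal hand-written sketch
of that transformation (usually called loop unswitching). It is not the
godbolt code: the function names and case bodies are made up for
illustration, and it assumes the switch value `x` does not change
inside the loop.

```d
// Hypothetical example; names and case bodies are illustrative only.

// Shape 1: a loop containing a switch on a loop-invariant value.
int loopWithSwitch(const int[] data, int x)
{
    int acc = 0;
    foreach (v; data)
    {
        switch (x)
        {
            case 0: acc += v; break;
            case 1: acc -= v; break;
            case 2: acc ^= v; break;
            case 3: acc += 2 * v; break;
            default: break;
        }
    }
    return acc;
}

// Shape 2: the same computation with the switch moved outside, so
// each case gets its own loop and the switch runs only once.
int switchWithLoops(const int[] data, int x)
{
    int acc = 0;
    switch (x)
    {
        case 0: foreach (v; data) acc += v; break;
        case 1: foreach (v; data) acc -= v; break;
        case 2: foreach (v; data) acc ^= v; break;
        case 3: foreach (v; data) acc += 2 * v; break;
        default: break; // nothing to do, so no loop is needed here
    }
    return acc;
}
```

Both functions return the same result for any input. The second shape
duplicates the loop body once per case, which is presumably why an
optimizer may decline to do it when the loop body is large.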
That doesn't work for my case, probably because the loop is too
large. Even with O3, the switch statement is still...
00007FF7C703075F movsxd rdi,dword ptr [rdx+rsi*4]
00007FF7C7030763 add rdi,rdx
00007FF7C7030766 jmp rdi
And it still checks the switch index inside the loop, although it has
moved the check a bit earlier in the code. It seems it can tell that
some of the code before the switch has no effect if the switch is
bypassed.