Standard way to supply hints to branches

Walter Bright newshound2 at digitalmars.com
Wed Sep 11 17:21:18 UTC 2024


On 9/11/2024 4:44 AM, Manu wrote:
> Okay, I see. You're depending on the optimiser to specifically collapse the goto 
> into the branch as a simplification.

Actually, the same code is generated without optimization. All it's doing is 
removing blocks that consist of nothing but "goto". It's a trivial optimization, 
and was there in the earliest version of the compiler.
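
A minimal sketch of that kind of pass, in C. The `Block` struct and `resolve_target` are hypothetical names made up for illustration, not the compiler's actual internals. The idea is only that a jump aimed at a goto-only block can be re-aimed at wherever that block goes:

```c
#include <assert.h>

/* Hypothetical block representation, for illustration only:
   this is not dmd's actual data structure. */
typedef struct {
    int goto_only;  /* nonzero if the block is nothing but "goto target" */
    int target;     /* index of the block that goto jumps to */
} Block;

/* Follow a chain of goto-only blocks until reaching a block with real
   content. The hop limit stops a cycle of empty gotos from looping forever. */
int resolve_target(const Block *blocks, int n, int idx)
{
    assert(idx >= 0 && idx < n);
    for (int hops = 0; hops < n && blocks[idx].goto_only; hops++)
        idx = blocks[idx].target;
    return idx;
}
```

Once every branch has been re-aimed this way, nothing refers to the goto-only blocks any more and they can simply be dropped.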


> Surely that's not even remotely reliable. There are several ways to optimise 
> that function, and I see no reason an optimiser would reliably choose a 
> construct like you show.

gcc -O does more or less the same thing.


> I'm actually a little surprised; a lifetime of experience with this sort of 
> thing might have led me to predict that the optimiser would /actually/ shift 
> the `return 0` up into the place of the goto, effectively eliminating the 
> goto... I'm sure I've seen optimisers do that transformation before, but I can't 
> recall ever noting an instance of code generation that looks like what you 
> pasted... I reckon I might have spotted that before.

The goto remains in the gcc -O version.


> ... and turns out, I'm right. I was so surprised with the codegen you present 
> that I pulled out compiler explorer and ran some experiments.
> I tested GCC and Clang for x86, MIPS, and PPC, all of which I am extremely 
> familiar with, and all of them optimise the way I predicted. None of them showed 
> a pattern like you presented here.

gcc -O produced:

```
foo:
     mov       EAX,0
     test      EDI,EDI
     jne       L1B
     sub       RSP,8
     call      bar@PC32
     mov       EAX,1
     add       RSP,8
L1B:    rep
     ret
baz:
     mov       EAX,0
     test      EDI,EDI
     jne       L38
     sub       RSP,8
     call      bar@PC32
     mov       EAX,1
     add       RSP,8
L38:    rep
     ret
```

> If I had to guess; I would actually imagine that GCC and Clang will very 
> deliberately NOT make a transformation like the one you show, for the precise 
> reason that such a transformation changes the nature of static branch prediction 
> which someone might have written code to rely on. It would be dangerous for the 
> optimiser to transform the code in the way you show, and so it doesn't.

The transformation is (intermediate code):
```
if (i) goto L2; else goto L4;
L2:
    goto L3;
L4:
    bar();
    return 1;
L3:
    return 0;
```
becomes:
```
if (i) goto L3; else goto L4;
L4:
     bar();
     return 1;
L3:
     return 0;
```
I.e. the goto->goto was replaced with a single goto.
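
The two forms can be written out as compilable C to check that they behave the same. This is a sketch under assumptions: `before`, `after`, and the call-counting `bar` are hypothetical stand-ins, since the real `bar` is external. The branch condition is unchanged; only its target moves from L2 to L3:

```c
/* Hypothetical stand-in for the external bar(): it only counts calls. */
static int bar_calls = 0;
static void bar(void) { bar_calls++; }

/* Before: the taken branch lands on L2, a block that is nothing but a goto. */
int before(int i)
{
    if (i) goto L2; else goto L4;
L2:
    goto L3;
L4:
    bar();
    return 1;
L3:
    return 0;
}

/* After: the branch jumps straight to L3 and the empty block is gone. */
int after(int i)
{
    if (i) goto L3; else goto L4;
L4:
    bar();
    return 1;
L3:
    return 0;
}
```

For any `i`, both functions return the same value and call `bar` the same number of times, and the fall-through path (the call to `bar`) stays where it was.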

It's not dangerous or weird at all, nor does it interfere with branch prediction.


