Standard way to supply hints to branches
Manu
turkeyman at gmail.com
Wed Sep 11 11:44:33 UTC 2024
On Wed, 11 Sept 2024 at 06:21, Walter Bright via Digitalmars-d <
digitalmars-d at puremagic.com> wrote:
> Compile the following with -vasm -O:
>
> ```
> void bar();
>
> int foo(int i)
> {
>     if (i)
>         return 0;
>     bar();
>     return 1;
> }
>
> int baz(int i)
> {
>     if (i)
>         goto Lreturn0;
>     bar();
>     return 1;
>
> Lreturn0:
>     return 0;
> }
> ```
> and you get:
> ```
> _D5test93fooFiZi:
> 0000: 55 push RBP
> 0001: 48 8B EC mov RBP,RSP
> 0004: 85 FF test EDI,EDI
> 0006: 74 04 je Lc
> 0008: 31 C0 xor EAX,EAX // hot path
> 000a: 5D pop RBP
> 000b: C3 ret
> 000c: E8 00 00 00 00 call L0 // cold path
> 0011: B8 01 00 00 00 mov EAX,1
> 0016: 5D pop RBP
> 0017: C3 ret
> _D5test93bazFiZi:
> 0000: 55 push RBP
> 0001: 48 8B EC mov RBP,RSP
> 0004: 85 FF test EDI,EDI
> 0006: 75 0C jne L14
> 0008: E8 00 00 00 00 call L0 // hot path
> 000d: B8 01 00 00 00 mov EAX,1
> 0012: 5D pop RBP
> 0013: C3 ret
> 0014: 31 C0 xor EAX,EAX // cold path
> 0016: 5D pop RBP
> 0017: C3 ret
> ```
>
Okay, I see. You're depending on the optimiser to specifically collapse the
goto into the branch as a simplification.
Surely that's not even remotely reliable. There are several ways to
optimise that function, and I see no reason an optimiser would reliably
choose a construct like you show.

I'm actually a little surprised; a lifetime of experience with this sort of
thing would have led me to predict that the optimiser would *actually*
shift the `return 0` up into the place of the goto, effectively eliminating
the goto... I'm sure I've seen optimisers do that transformation before,
but I can't recall ever noting an instance of code generation that looks
like what you pasted... I reckon I would have spotted that before.

... and it turns out I'm right. I was so surprised by the codegen you
presented that I pulled out Compiler Explorer and ran some experiments.
I tested GCC and Clang for x86, MIPS, and PPC, all of which I am extremely
familiar with, and all of them optimise the way I predicted. None of them
showed a pattern like the one you presented here.

If I had to guess, I would imagine that GCC and Clang very deliberately
do NOT make a transformation like the one you show, for the precise reason
that such a transformation changes the nature of static branch prediction,
which someone might have written code to rely on. It would be dangerous
for the optimiser to transform the code in the way you show, and so it
doesn't.
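
For anyone who wants to repeat the experiment, here is a minimal C sketch of
the two functions from the quoted post, plus a third variant that states the
expectation explicitly with `__builtin_expect`, the builtin that GCC and
Clang provide for this. The name `qux` and the counting body of `bar` are my
own assumptions, added only so the snippet runs on its own. To look at branch
layout rather than behaviour, declare `bar` as an external function instead,
so that it is not inlined. The builtin is only a hint, and the final layout
is still up to the compiler.

```c
/* Hypothetical stand-in for bar(): counts calls so the snippet is runnable. */
static int bar_calls = 0;
static void bar(void) { bar_calls++; }

/* Early-return form from the quoted post. */
int foo(int i)
{
    if (i)
        return 0;
    bar();
    return 1;
}

/* goto form from the quoted post; same behaviour as foo(). */
int baz(int i)
{
    if (i)
        goto Lreturn0;
    bar();
    return 1;

Lreturn0:
    return 0;
}

/* Hypothetical variant: tell GCC/Clang that i != 0 is the unlikely case,
 * instead of relying on how the source happens to be arranged. */
int qux(int i)
{
    if (__builtin_expect(i != 0, 0))
        return 0;
    bar();
    return 1;
}
```

All three functions return the same values for the same input; only the
information given to the compiler about the likely path differs.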