Using Link Time Optimization (LTO)

Iain Buclaw ibuclaw at gdcproject.org
Sun Mar 23 03:32:28 PDT 2014


On 23 March 2014 07:49, Johannes Pfau <nospam at example.com> wrote:
> On Sun, 23 Mar 2014 02:14:20 +0000,
> "Mike" <none at none.com> wrote:
>
>> Hello,
>>
>> I have some code generating the following assembly:
>> {OnReset}:
>>   8000010:       b508            push    {r3, lr}
>>   8000012:       20ff            movs    r0, #255        ; 0xff
>>   8000014:       f000 f828       bl      8000068 <{MyFunction}>
>>   8000018:       e7fe            b.n     8000018 <{OnReset}+0x8>
>>   800001a:       bf00            nop
>>
>> 08000068
>> {MyFunction}:
>>   8000068:       f44f 5380       mov.w   r3, #4096       ; 0x1000
>>   800006c:       f2c2 0300       movt    r3, #8192       ; 0x2000
>>   8000070:       7018            strb    r0, [r3, #0]
>>   8000072:       4770            bx      lr
>>
>> "MyFunction" and "OnReset" are in different source files and
>> are therefore compiled to different object files.  I would like to
>> get "MyFunction" fully inlined into "OnReset" to remove the extra
>> branch instructions (bl and bx).
>>
>> It's my understanding that because the two functions are compiled
>> into separate object files, this must be done using LTO.  If I
>> compile them into the same object file, I get the full inlining
>> I'm looking for, but that's not going to scale well for my
>> project.
>>
>> ** Beautiful, isn't it? **
>> {OnReset}:
>>   8000010:       f44f 5380       mov.w   r3, #4096       ; 0x1000
>>   8000014:       f2c2 0300       movt    r3, #8192       ; 0x2000
>>   8000018:       22ff            movs    r2, #255        ; 0xff
>>   800001a:       701a            strb    r2, [r3, #0]
>>   800001c:       e7fe            b.n     800001c <{OnReset}+0xc>
>>   800001e:       bf00            nop
>>
>>
>> I've tried adding -flto to my compiler and linker flags and a
>> number of other things without success.  The compiler seems to
>> generate extra information in my object files, but the linker
>> doesn't seem to do the optimization.  I don't get any of the ICEs
>> described in Bugs 61 and 88, however; I just don't get the result
>> I'm after.
>>
>> Here are my compiler commands:
>> arm-none-eabi-gdc -mthumb -mcpu=cortex-m4 -fno-emit-moduleinfo
>> -ffunction-sections -fdata-sections -O3 -c -flto ...
>> arm-none-eabi-ld -T link/link.ld -Map binary/memory.map
>> --gc-sections -flto ...
>>
>> I'm using my arm-none-eabi cross toolchain built from the GDC 4.8
>> branch.  I tried adding --enable-lto to my toolchain's configure,
>> but that had no effect.  It's my understanding that it's enabled
>> by default anyway.
>>
>> Does anyone know how I can get this level of inlining without
>> compiling all my source into one object file?
>>
>> Thanks for any help,
>> Mike
>
> LTO used to be supported only by the gold linker, so you might
> need to configure binutils with --enable-gold --enable-plugins
> --enable-lto
>
> GCC should also be compiled with --enable-gold --enable-plugins
> --enable-lto
>
> http://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/Optimize-Options.html
> also says that if you link manually, you must link with gcc rather
> than ld, and pass -flto at link time as well:
> gcc -o myprog -flto -O2 foo.o bar.o
>
> You can also try passing -fuse-linker-plugin to all gcc commands.
>
> I've never used LTO though, so I'm not sure if this will actually help :-)

I'd rather we fixed the outstanding LTO bug before we start testing with it. :o)
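
For anyone who wants to experiment anyway, here is a rough sketch of what
Johannes's suggestion would look like with the flags from Mike's post:
compile each module with -flto, then link through the compiler driver
instead of arm-none-eabi-ld, passing -flto and the optimization level
again.  The file names (start.d, myfunction.d, binary/firmware.elf) are
made up for illustration, and -nostdlib is an assumption to mimic the bare
ld invocation.  The script only prints the commands (a dry run); whether
the result actually inlines across modules on the 4.8 branch is untested.

```shell
#!/bin/sh
# Dry-run sketch: prints the build commands instead of executing them.
# Hypothetical file names: start.d, myfunction.d, binary/firmware.elf.

CC=arm-none-eabi-gdc

# Flags from the original post, with -flto included so that each object
# file carries the intermediate representation needed at link time.
CFLAGS="-mthumb -mcpu=cortex-m4 -fno-emit-moduleinfo -ffunction-sections -fdata-sections -O3 -flto"

# Compile each module to its own object file, as before.
for src in start.d myfunction.d; do
    echo "$CC $CFLAGS -c $src -o ${src%.d}.o"
done

# Link via the compiler driver rather than calling ld directly, so the
# driver can run the LTO step.  Linker-only options go through -Wl,.
# -nostdlib (assumption) keeps the default startup files and libraries
# out, roughly matching a direct ld invocation.
LINK_CMD="$CC $CFLAGS -nostdlib -Wl,-T,link/link.ld -Wl,-Map,binary/memory.map -Wl,--gc-sections start.o myfunction.o -o binary/firmware.elf"
echo "$LINK_CMD"
```

If the linker plugin is available in the toolchain, -fuse-linker-plugin
could be added to both the compile and link flags, as Johannes mentions.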

