Using Link Time Optimization (LTO)
Iain Buclaw
ibuclaw at gdcproject.org
Sun Mar 23 03:32:28 PDT 2014
On 23 March 2014 07:49, Johannes Pfau <nospam at example.com> wrote:
> On Sun, 23 Mar 2014 02:14:20 +0000,
> "Mike" <none at none.com> wrote:
>
>> Hello,
>>
>> I have some code generating the following assembly:
>> {OnReset}:
>> 8000010: b508 push {r3, lr}
>> 8000012: 20ff movs r0, #255 ; 0xff
>> 8000014: f000 f828 bl 8000068 <{MyFunction}>
>> 8000018: e7fe b.n 8000018 <{OnReset}+0x8>
>> 800001a: bf00 nop
>>
>> 08000068
>> {MyFunction}:
>> 8000068: f44f 5380 mov.w r3, #4096 ; 0x1000
>> 800006c: f2c2 0300 movt r3, #8192 ; 0x2000
>> 8000070: 7018 strb r0, [r3, #0]
>> 8000072: 4770 bx lr
>>
>> "MyFunction" and "OnReset" are in different source files and
>> therefore compiled to different object files. I would like to
>> get "MyFunction" fully inlined to "OnReset" to remove the extra
>> branch instructions (bl and bx).
>>
>> It's my understanding that because the two functions are compiled
>> into separate object files, this must be done using LTO. If I
>> compile them into the same object file, I get the full inlining
>> I'm looking for, but that's not going to scale well for my
>> project.
>>
>> ** Beautiful, isn't it? **
>> {OnReset}:
>> 8000010: f44f 5380 mov.w r3, #4096 ; 0x1000
>> 8000014: f2c2 0300 movt r3, #8192 ; 0x2000
>> 8000018: 22ff movs r2, #255 ; 0xff
>> 800001a: 701a strb r2, [r3, #0]
>> 800001c: e7fe b.n 800001c <{OnReset}+0xc>
>> 800001e: bf00 nop
>>
>>
>> I've tried adding -flto to my compiler and linker flags, among a
>> number of other things, without success. The compiler seems to
>> generate extra information in my object files, but the linker
>> doesn't seem to perform the optimization. Unlike the reports in
>> Bugs 61 and 88, I don't get any ICEs; I just don't get the result
>> I'm after.
>>
>> Here are my compiler commands:
>> arm-none-eabi-gdc -mthumb -mcpu=cortex-m4 -fno-emit-moduleinfo
>> -ffunction-sections -fdata-sections -O3 -c -flto ...
>> arm-none-eabi-ld -T link/link.ld -Map binary/memory.map
>> --gc-sections -flto ...
>>
>> I'm using my arm-none-eabi cross toolchain built from the GDC 4.8
>> branch. I tried adding --enable-lto to my toolchain's configure,
>> but that had no effect. It's my understanding that it's enabled
>> by default anyway.
>>
>> Does anyone know how I can get this level of inlining without
>> compiling all my source into one object file?
>>
>> Thanks for any help,
>> Mike
>
> Some time ago LTO was only supported by the gold linker, so you might
> need to configure binutils with --enable-gold --enable-plugins
> --enable-lto
>
> GCC should also be compiled with --enable-gold --enable-plugins
> --enable-lto
>
> http://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/Optimize-Options.html
> also says that if you link manually you must link with gcc, not ld,
> and pass -flto at link time as well:
> gcc -o myprog -flto -O2 foo.o bar.o
>
> You can also try passing -fuse-linker-plugin to all gcc commands.
>
> I never used LTO though, so I'm not sure if this will actually help :-)
I'd rather we fixed the outstanding LTO bug before we start testing with it. :o)
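
For reference, the linking advice quoted above could be tried roughly as follows. This is an untested sketch, not a confirmed fix: the source and output file names (start.d, myfunction.d, binary/firmware.elf) are hypothetical, -nostdlib is an assumption for a bare-metal link, and the linker options from the original ld command are forwarded through the gdc driver with -Wl. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: file names below are hypothetical, not taken from the thread.
GDC=${GDC:-arm-none-eabi-gdc}
CPUFLAGS="-mthumb -mcpu=cortex-m4"
OPTFLAGS="-O3 -flto"

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "$@"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Compile each module to its own object file, with -flto so the
# object carries the intermediate representation LTO needs.
for src in start.d myfunction.d; do
    run $GDC $CPUFLAGS $OPTFLAGS -fno-emit-moduleinfo \
        -ffunction-sections -fdata-sections -c "$src" -o "${src%.d}.o"
done

# Link through the compiler driver rather than calling ld directly,
# repeating -flto and the optimization level at link time.
# -nostdlib is an assumption: the original ld command linked no
# startup files or standard libraries.
run $GDC $CPUFLAGS $OPTFLAGS -nostdlib \
    -Wl,-T,link/link.ld -Wl,-Map,binary/memory.map -Wl,--gc-sections \
    start.o myfunction.o -o binary/firmware.elf
```

The relevant difference from the original commands is that the final link goes through the driver, which is what lets the LTO step run before ld sees the objects. Whether this works with a GDC 4.8 toolchain depends on the open LTO bugs mentioned above.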