Testing GDC (GCC 7.1) on Runtime-less ARM Cortex-M

Mike via D.gnu d.gnu at puremagic.com
Wed Jun 28 14:15:54 PDT 2017


On Wednesday, 28 June 2017 at 16:52:26 UTC, Iain Buclaw wrote:

> You probably want to tone down on optimizations as well.  -O3 
> will be doing a lot of work, sometimes for little or no gain.  
> In most cases, -O2 -finline-functions is good enough, which can 
> be abbreviated further as simply -Os.  [for full list of 
> enabled/disabled passes: gdc -Q -Os --help=optimizers]
>
> You can see a breakdown of what areas the compiler spends the 
> most time in with -ftime-report

Compiling with -O3
------------------
phase opt and generate  :  50.74 (96%) usr  21.24 (99%) sys  72.94 (97%) wall 2426962 kB (94%) ggc
TOTAL                   :  53.02            21.49            75.55            2589984 kB

real    1m21.339s
user    0m57.086s
sys     0m22.297s

arm-none-eabi-size binary/firmware
    text    data     bss     dec     hex filename
    6228       0  153600  159828   27054 binary/firmware


Compiling with -O2 -finline-functions
-------------------------------------
phase opt and generate  :  50.71 (96%) usr  20.58 (98%) sys  72.04 (97%) wall 2381419 kB (94%) ggc
TOTAL                   :  52.89            20.93            74.63            2544441 kB

real    1m20.755s
user    0m56.857s
sys     0m21.826s

arm-none-eabi-size binary/firmware
    text    data     bss     dec     hex filename
    5912       0  153600  159512   26f18 binary/firmware

Compiling with -O0
------------------
phase opt and generate  :  22.95 (91%) usr   5.42 (94%) sys  28.38 (92%) wall 1777106 kB (92%) ggc
TOTAL                   :  25.14             5.74            30.94            1940102 kB

real    0m36.476s
user    0m29.600s
sys     0m6.647s

arm-none-eabi-size binary/firmware
    text    data     bss     dec     hex filename
   45250       0  153600  198850   308c2 binary/firmware


-------------------------------------------------------------------------
The vast majority of time is spent in "phase opt and generate".  
A few observations:

* Elapsed time is nearly identical for -O3 and -O2 
-finline-functions (about 1m21s real in both cases)
* -O2 -finline-functions gave me a smaller binary (5912 vs. 6228 
bytes of text) :)
* -O0 reduced time significantly (about 36s real), but "phase opt 
and generate" still takes an awfully long time relative to 
everything else
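
To put rough numbers on the observations above, here is a small Python sketch that works only from the figures already pasted in this message. The names (RUNS, percent_change, opt_gen_share, report) are made up for illustration and are not part of any GCC or GDC tooling.

```python
# Figures copied from the -ftime-report and arm-none-eabi-size output
# in this message: wall-clock seconds and .text size in bytes.
RUNS = {
    "-O3": {"total_wall": 75.55, "opt_gen_wall": 72.94, "text": 6228},
    "-O2 -finline-functions": {"total_wall": 74.63, "opt_gen_wall": 72.04, "text": 5912},
    "-O0": {"total_wall": 30.94, "opt_gen_wall": 28.38, "text": 45250},
}


def percent_change(new, base):
    """Signed change of `new` relative to `base`, in percent."""
    return (new - base) / base * 100.0


def opt_gen_share(run):
    """Share of total wall time spent in "phase opt and generate", in percent."""
    return run["opt_gen_wall"] / run["total_wall"] * 100.0


def report(baseline="-O3"):
    """Build one summary line per run, comparing against the baseline run."""
    base = RUNS[baseline]
    lines = []
    for flags, run in RUNS.items():
        wall = percent_change(run["total_wall"], base["total_wall"])
        text = percent_change(run["text"], base["text"])
        share = opt_gen_share(run)
        lines.append(
            f"{flags:<24} wall {wall:+7.1f}%  text {text:+7.1f}%  opt+gen {share:5.1f}%"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Relative to -O3, -O0 cuts the reported wall time by a bit more than half while the text section grows roughly sevenfold, and "phase opt and generate" stays above 90% of the wall time in all three runs.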

What exactly is "phase opt and generate"?  I'm assuming "opt" 
means optimizer, but why does it take so long even with -O0?  
Maybe the "generate" part accounts for most of it.

With -O0 there are still quite a few things enabled, so maybe I'll 
start disabling them one at a time with the matching "-fno-" flag 
and see if I can find a culprit.

-O0 -Q --help=optimizers
   -faggressive-loop-optimizations       [enabled]
   -fauto-inc-dec                        [enabled]
   -fdce                                 [enabled]
   -fdelete-null-pointer-checks          [enabled]
   -fdse                                 [enabled]
   -fearly-inlining                      [enabled]
   -ffp-contract=[off|on|fast]           fast
   -ffp-int-builtin-inexact              [enabled]
   -ffunction-cse                        [enabled]
   -fgcse-lm                             [enabled]
   -finline                              [enabled]
   -finline-atomics                      [enabled]
   -fira-hoist-pressure                  [enabled]
   -fira-share-save-slots                [enabled]
   -fira-share-spill-slots               [enabled]
   -fivopts                              [enabled]
   -fjump-tables                         [enabled]
   -flifetime-dse                        [enabled]
   -fmath-errno                          [enabled]
   -fpeephole                            [enabled]
   -fplt                                 [enabled]
   -fprefetch-loop-arrays                [enabled]
   -fprintf-return-value                 [enabled]
   -freg-struct-return                   [enabled]
   -frename-registers                    [enabled]
   -frtti                                [enabled]
   -fsched-critical-path-heuristic       [enabled]
   -fsched-dep-count-heuristic           [enabled]
   -fsched-group-heuristic               [enabled]
   -fsched-interblock                    [enabled]
   -fsched-last-insn-heuristic           [enabled]
   -fsched-rank-heuristic                [enabled]
   -fsched-spec                          [enabled]
   -fsched-spec-insn-heuristic           [enabled]
   -fsched-stalled-insns-dep             [enabled]
   -fschedule-fusion                     [enabled]
   -fshort-enums                         [enabled]
   -fshrink-wrap-separate                [enabled]
   -fsigned-zeros                        [enabled]
   -fsimd-cost-model=[unlimited|dynamic|cheap]   unlimited
   -fsplit-ivs-in-unroller               [enabled]
   -fssa-backprop                        [enabled]
   -fstack-reuse=[all|named_vars|none]   all
   -fstdarg-opt                          [enabled]
   -fstrict-volatile-bitfields           [enabled]
   -fno-threadsafe-statics               [enabled]
   -ftrapping-math                       [enabled]
   -ftree-cselim                         [enabled]
   -ftree-forwprop                       [enabled]
   -ftree-loop-if-convert                [enabled]
   -ftree-loop-im                        [enabled]
   -ftree-loop-ivcanon                   [enabled]
   -ftree-loop-optimize                  [enabled]
   -ftree-phiprop                        [enabled]
   -ftree-reassoc                        [enabled]
   -ftree-scev-cprop                     [enabled]
   -fvar-tracking                        [enabled]
   -fvar-tracking-assignments            [enabled]
   -fweb                                 [enabled]
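
The plan of switching these off one at a time could be scripted roughly as follows. This is only a sketch: it assumes the listing has been saved as text, it skips the entries that take a value (such as -ffp-contract=), and the function names and the SAMPLE string are invented for illustration. The base command is a placeholder; the real cross-compiler name and source files would have to be filled in.

```python
def enabled_boolean_flags(help_output):
    """Collect on/off flags marked [enabled] in `-Q --help=optimizers` output."""
    flags = []
    for line in help_output.splitlines():
        parts = line.split()
        if (
            len(parts) == 2
            and parts[1] == "[enabled]"
            and parts[0].startswith("-f")
            and "=" not in parts[0]  # skip flags that take a value
        ):
            flags.append(parts[0])
    return flags


def negate_flag(flag):
    """Return the opposite form of an on/off -f flag."""
    if flag.startswith("-fno-"):
        return "-f" + flag[len("-fno-"):]
    return "-fno-" + flag[len("-f"):]


def candidate_commands(help_output, base=("gdc", "-O0", "-ftime-report")):
    """Yield one command line per enabled flag, with just that flag negated.

    `base` is a placeholder (assumption): the actual compiler binary and
    the source files to compile must be appended by the caller.
    """
    for flag in enabled_boolean_flags(help_output):
        yield list(base) + [negate_flag(flag)]


# Three lines taken from the listing above, used as a small sample.
SAMPLE = """\
  -fdce                                 [enabled]
  -ffp-contract=[off|on|fast]           fast
  -fno-threadsafe-statics               [enabled]
"""

if __name__ == "__main__":
    for cmd in candidate_commands(SAMPLE):
        print(" ".join(cmd))
```

Each generated command line can then be timed on its own and compared against the plain -O0 build. Some entries in the listing (for example -frtti) look language-specific, so the compiler may warn about or ignore their negated forms when building D code.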

