Testing GDC (GCC 7.1) on Runtime-less ARM Cortex-M
Mike via D.gnu
d.gnu at puremagic.com
Wed Jun 28 15:50:50 PDT 2017
On Wednesday, 28 June 2017 at 22:17:09 UTC, Iain Buclaw wrote:
> Phase opt and generate is the top-level timer for the entire
> "backend" compilation phase. I was expecting to see more of a
> breakdown of individual passes.
Sorry, it didn't look broken down to me. Here's the full report.
arm-none-eabi-gdc -c -O2 -finline-functions -nophoboslib
-nostdinc -nodefaultlibs -nostdlib -fno-emit-moduleinfo -mthumb
-mcpu=cortex-m4 -Isource/runtime -fno-bounds-check
-fno-invariants -fno-in -fno-out -ffunction-sections
-fdata-sections -ftime-report source/gcc/attribute.d
source/board/package.d source/board/ILI9341.d source/board/lcd.d
source/board/spi5.d source/board/statusLED.d
source/board/random.d source/board/ltdc.d source/stm32f42/bus.d
source/stm32f42/scb.d source/stm32f42/trace.d
source/stm32f42/dma2d.d source/stm32f42/spi.d
source/stm32f42/pwr.d source/stm32f42/rcc.d source/stm32f42/rng.d
source/stm32f42/nvic.d source/stm32f42/mmio.d
source/stm32f42/flash.d source/stm32f42/gpio.d
source/stm32f42/ltdc.d source/main.d -o binary/firmware.o
Execution times (seconds)
phase setup : 0.01 ( 0%) usr 0.00 ( 0%) sys
0.01 ( 0%) wall 2310 kB ( 0%) ggc
phase parsing : 2.21 ( 4%) usr 0.32 ( 2%) sys
2.55 ( 3%) wall 160684 kB ( 6%) ggc
phase opt and generate : 51.89 (96%) usr 20.13 (98%) sys
72.29 (97%) wall 2381419 kB (94%) ggc
phase last asm : 0.00 ( 0%) usr 0.00 ( 0%) sys
0.01 ( 0%) wall 26 kB ( 0%) ggc
phase finalize : 0.00 ( 0%) usr 0.00 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
garbage collection : 0.90 ( 2%) usr 0.04 ( 0%) sys
1.05 ( 1%) wall 0 kB ( 0%) ggc
dump files : 4.17 ( 8%) usr 1.96 (10%) sys
5.67 ( 8%) wall 0 kB ( 0%) ggc
callgraph construction : 0.66 ( 1%) usr 0.20 ( 1%) sys
1.07 ( 1%) wall 26036 kB ( 1%) ggc
callgraph optimization : 1.55 ( 3%) usr 0.78 ( 4%) sys
1.89 ( 3%) wall 1689 kB ( 0%) ggc
ipa dead code removal : 0.29 ( 1%) usr 0.00 ( 0%) sys
0.28 ( 0%) wall 0 kB ( 0%) ggc
ipa inheritance graph : 0.01 ( 0%) usr 0.00 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
ipa devirtualization : 0.02 ( 0%) usr 0.00 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
ipa cp : 0.21 ( 0%) usr 0.01 ( 0%) sys
0.18 ( 0%) wall 6160 kB ( 0%) ggc
ipa inlining heuristics : 0.69 ( 1%) usr 0.15 ( 1%) sys
0.67 ( 1%) wall 88573 kB ( 3%) ggc
ipa function splitting : 0.02 ( 0%) usr 0.00 ( 0%) sys
0.03 ( 0%) wall 9 kB ( 0%) ggc
ipa comdats : 0.05 ( 0%) usr 0.00 ( 0%) sys
0.05 ( 0%) wall 0 kB ( 0%) ggc
ipa various optimizations: 0.08 ( 0%) usr 0.00 ( 0%) sys
0.09 ( 0%) wall 0 kB ( 0%) ggc
ipa reference : 0.12 ( 0%) usr 0.00 ( 0%) sys
0.12 ( 0%) wall 0 kB ( 0%) ggc
ipa profile : 0.07 ( 0%) usr 0.00 ( 0%) sys
0.07 ( 0%) wall 0 kB ( 0%) ggc
ipa pure const : 0.38 ( 1%) usr 0.09 ( 0%) sys
0.54 ( 1%) wall 0 kB ( 0%) ggc
ipa icf : 1.59 ( 3%) usr 0.01 ( 0%) sys
1.60 ( 2%) wall 11 kB ( 0%) ggc
ipa SRA : 0.02 ( 0%) usr 0.00 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
ipa free lang data : 0.03 ( 0%) usr 0.00 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
ipa free inline summary : 0.03 ( 0%) usr 0.00 ( 0%) sys
0.04 ( 0%) wall 0 kB ( 0%) ggc
cfg construction : 0.15 ( 0%) usr 0.06 ( 0%) sys
0.12 ( 0%) wall 5 kB ( 0%) ggc
cfg cleanup : 0.66 ( 1%) usr 0.27 ( 1%) sys
1.04 ( 1%) wall 17 kB ( 0%) ggc
trivially dead code : 0.12 ( 0%) usr 0.05 ( 0%) sys
0.38 ( 1%) wall 0 kB ( 0%) ggc
df scan insns : 0.45 ( 1%) usr 0.19 ( 1%) sys
0.56 ( 1%) wall 5569 kB ( 0%) ggc
df multiple defs : 0.24 ( 0%) usr 0.06 ( 0%) sys
0.28 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 0.15 ( 0%) usr 0.03 ( 0%) sys
0.26 ( 0%) wall 0 kB ( 0%) ggc
df live regs : 0.60 ( 1%) usr 0.25 ( 1%) sys
0.70 ( 1%) wall 0 kB ( 0%) ggc
df live&initialized regs: 0.32 ( 1%) usr 0.13 ( 1%) sys
0.57 ( 1%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 0.05 ( 0%) usr 0.03 ( 0%) sys
0.11 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.56 ( 1%) usr 0.18 ( 1%) sys
0.86 ( 1%) wall 2562 kB ( 0%) ggc
register information : 0.14 ( 0%) usr 0.13 ( 1%) sys
0.40 ( 1%) wall 0 kB ( 0%) ggc
alias analysis : 0.79 ( 1%) usr 0.34 ( 2%) sys
1.14 ( 2%) wall 28569 kB ( 1%) ggc
alias stmt walking : 0.10 ( 0%) usr 0.02 ( 0%) sys
0.07 ( 0%) wall 0 kB ( 0%) ggc
register scan : 0.07 ( 0%) usr 0.01 ( 0%) sys
0.11 ( 0%) wall 106 kB ( 0%) ggc
rebuild jump labels : 0.05 ( 0%) usr 0.05 ( 0%) sys
0.15 ( 0%) wall 0 kB ( 0%) ggc
parser (global) : 2.19 ( 4%) usr 0.32 ( 2%) sys
2.51 ( 3%) wall 160144 kB ( 6%) ggc
early inlining heuristics: 0.17 ( 0%) usr 0.09 ( 0%) sys
0.24 ( 0%) wall 19510 kB ( 1%) ggc
inline parameters : 0.35 ( 1%) usr 0.18 ( 1%) sys
0.44 ( 1%) wall 58124 kB ( 2%) ggc
integration : 0.63 ( 1%) usr 0.24 ( 1%) sys
0.85 ( 1%) wall 80071 kB ( 3%) ggc
tree gimplify : 0.48 ( 1%) usr 0.17 ( 1%) sys
0.53 ( 1%) wall 109681 kB ( 4%) ggc
tree eh : 0.13 ( 0%) usr 0.07 ( 0%) sys
0.20 ( 0%) wall 13982 kB ( 1%) ggc
tree CFG construction : 0.19 ( 0%) usr 0.05 ( 0%) sys
0.17 ( 0%) wall 54230 kB ( 2%) ggc
tree CFG cleanup : 0.69 ( 1%) usr 0.38 ( 2%) sys
1.19 ( 2%) wall 1131 kB ( 0%) ggc
tree tail merge : 0.11 ( 0%) usr 0.02 ( 0%) sys
0.09 ( 0%) wall 0 kB ( 0%) ggc
tree VRP : 0.93 ( 2%) usr 0.35 ( 2%) sys
1.29 ( 2%) wall 89761 kB ( 4%) ggc
tree Early VRP : 0.21 ( 0%) usr 0.08 ( 0%) sys
0.31 ( 0%) wall 42204 kB ( 2%) ggc
tree copy propagation : 0.06 ( 0%) usr 0.03 ( 0%) sys
0.10 ( 0%) wall 0 kB ( 0%) ggc
tree PTA : 1.78 ( 3%) usr 0.85 ( 4%) sys
2.50 ( 3%) wall 4103 kB ( 0%) ggc
tree PHI insertion : 0.07 ( 0%) usr 0.02 ( 0%) sys
0.03 ( 0%) wall 6571 kB ( 0%) ggc
tree SSA rewrite : 0.16 ( 0%) usr 0.06 ( 0%) sys
0.20 ( 0%) wall 20087 kB ( 1%) ggc
tree SSA other : 0.21 ( 0%) usr 0.13 ( 1%) sys
0.51 ( 1%) wall 5602 kB ( 0%) ggc
tree SSA incremental : 0.15 ( 0%) usr 0.10 ( 0%) sys
0.30 ( 0%) wall 60 kB ( 0%) ggc
tree operand scan : 0.34 ( 1%) usr 0.22 ( 1%) sys
0.56 ( 1%) wall 56364 kB ( 2%) ggc
dominator optimization : 0.73 ( 1%) usr 0.22 ( 1%) sys
0.75 ( 1%) wall 7545 kB ( 0%) ggc
backwards jump threading: 0.30 ( 1%) usr 0.09 ( 0%) sys
0.25 ( 0%) wall 111 kB ( 0%) ggc
tree SRA : 0.13 ( 0%) usr 0.04 ( 0%) sys
0.17 ( 0%) wall 28 kB ( 0%) ggc
isolate eroneous paths : 0.04 ( 0%) usr 0.03 ( 0%) sys
0.09 ( 0%) wall 0 kB ( 0%) ggc
tree CCP : 0.68 ( 1%) usr 0.24 ( 1%) sys
0.85 ( 1%) wall 7302 kB ( 0%) ggc
tree PHI const/copy prop: 0.05 ( 0%) usr 0.02 ( 0%) sys
0.10 ( 0%) wall 0 kB ( 0%) ggc
tree split crit edges : 0.05 ( 0%) usr 0.06 ( 0%) sys
0.17 ( 0%) wall 19 kB ( 0%) ggc
tree reassociation : 0.23 ( 0%) usr 0.07 ( 0%) sys
0.38 ( 1%) wall 6 kB ( 0%) ggc
tree PRE : 1.28 ( 2%) usr 0.48 ( 2%) sys
1.78 ( 2%) wall 50466 kB ( 2%) ggc
tree FRE : 0.69 ( 1%) usr 0.36 ( 2%) sys
1.22 ( 2%) wall 17297 kB ( 1%) ggc
tree code sinking : 0.10 ( 0%) usr 0.05 ( 0%) sys
0.13 ( 0%) wall 6 kB ( 0%) ggc
tree linearize phis : 0.19 ( 0%) usr 0.08 ( 0%) sys
0.27 ( 0%) wall 41714 kB ( 2%) ggc
tree backward propagate : 0.02 ( 0%) usr 0.00 ( 0%) sys
0.07 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate : 0.23 ( 0%) usr 0.08 ( 0%) sys
0.38 ( 1%) wall 62 kB ( 0%) ggc
tree phiprop : 0.06 ( 0%) usr 0.01 ( 0%) sys
0.04 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.21 ( 0%) usr 0.15 ( 1%) sys
0.36 ( 0%) wall 209 kB ( 0%) ggc
tree aggressive DCE : 0.28 ( 1%) usr 0.12 ( 1%) sys
0.44 ( 1%) wall 83438 kB ( 3%) ggc
tree buildin call DCE : 0.06 ( 0%) usr 0.00 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.09 ( 0%) usr 0.09 ( 0%) sys
0.21 ( 0%) wall 0 kB ( 0%) ggc
PHI merge : 0.07 ( 0%) usr 0.04 ( 0%) sys
0.11 ( 0%) wall 0 kB ( 0%) ggc
tree loop optimization : 0.02 ( 0%) usr 0.01 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
loopless fn : 0.04 ( 0%) usr 0.01 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
tree loop invariant motion: 0.01 ( 0%) usr 0.02 ( 0%) sys
0.07 ( 0%) wall 1 kB ( 0%) ggc
complete unrolling : 0.05 ( 0%) usr 0.04 ( 0%) sys
0.12 ( 0%) wall 136 kB ( 0%) ggc
tree iv optimization : 0.00 ( 0%) usr 0.00 ( 0%) sys
0.02 ( 0%) wall 120 kB ( 0%) ggc
tree copy headers : 0.03 ( 0%) usr 0.02 ( 0%) sys
0.03 ( 0%) wall 7 kB ( 0%) ggc
tree SSA uncprop : 0.28 ( 1%) usr 0.13 ( 1%) sys
0.31 ( 0%) wall 0 kB ( 0%) ggc
tree NRV optimization : 0.05 ( 0%) usr 0.00 ( 0%) sys
0.04 ( 0%) wall 849 kB ( 0%) ggc
tree switch conversion : 0.03 ( 0%) usr 0.00 ( 0%) sys
0.00 ( 0%) wall 0 kB ( 0%) ggc
tree strlen optimization: 0.03 ( 0%) usr 0.01 ( 0%) sys
0.09 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.09 ( 0%) usr 0.02 ( 0%) sys
0.04 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 1.42 ( 3%) usr 0.51 ( 2%) sys
1.94 ( 3%) wall 0 kB ( 0%) ggc
control dependences : 0.01 ( 0%) usr 0.00 ( 0%) sys
0.12 ( 0%) wall 0 kB ( 0%) ggc
out of ssa : 0.22 ( 0%) usr 0.10 ( 0%) sys
0.35 ( 0%) wall 7465 kB ( 0%) ggc
expand vars : 0.02 ( 0%) usr 0.02 ( 0%) sys
0.04 ( 0%) wall 506 kB ( 0%) ggc
expand : 0.63 ( 1%) usr 0.24 ( 1%) sys
1.12 ( 1%) wall 63840 kB ( 3%) ggc
post expand cleanups : 0.24 ( 0%) usr 0.04 ( 0%) sys
0.23 ( 0%) wall 18401 kB ( 1%) ggc
varconst : 0.01 ( 0%) usr 0.00 ( 0%) sys
0.04 ( 0%) wall 539 kB ( 0%) ggc
lower subreg : 0.07 ( 0%) usr 0.01 ( 0%) sys
0.05 ( 0%) wall 0 kB ( 0%) ggc
jump : 0.13 ( 0%) usr 0.00 ( 0%) sys
0.09 ( 0%) wall 0 kB ( 0%) ggc
forward prop : 0.73 ( 1%) usr 0.26 ( 1%) sys
0.86 ( 1%) wall 2110 kB ( 0%) ggc
CSE : 0.50 ( 1%) usr 0.19 ( 1%) sys
0.73 ( 1%) wall 1053 kB ( 0%) ggc
dead code elimination : 0.23 ( 0%) usr 0.07 ( 0%) sys
0.38 ( 1%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.24 ( 0%) usr 0.09 ( 0%) sys
0.48 ( 1%) wall 1039 kB ( 0%) ggc
dead store elim2 : 0.27 ( 0%) usr 0.14 ( 1%) sys
0.39 ( 1%) wall 960 kB ( 0%) ggc
loop analysis : 0.10 ( 0%) usr 0.06 ( 0%) sys
0.11 ( 0%) wall 0 kB ( 0%) ggc
loop init : 1.34 ( 2%) usr 0.51 ( 2%) sys
1.93 ( 3%) wall 183463 kB ( 7%) ggc
loop invariant motion : 0.00 ( 0%) usr 0.00 ( 0%) sys
0.06 ( 0%) wall 1 kB ( 0%) ggc
loop doloop : 0.02 ( 0%) usr 0.00 ( 0%) sys
0.05 ( 0%) wall 36 kB ( 0%) ggc
loop fini : 0.61 ( 1%) usr 0.31 ( 2%) sys
0.94 ( 1%) wall 0 kB ( 0%) ggc
CPROP : 0.21 ( 0%) usr 0.05 ( 0%) sys
0.21 ( 0%) wall 295 kB ( 0%) ggc
PRE : 0.09 ( 0%) usr 0.01 ( 0%) sys
0.06 ( 0%) wall 4 kB ( 0%) ggc
auto inc dec : 0.12 ( 0%) usr 0.06 ( 0%) sys
0.15 ( 0%) wall 934 kB ( 0%) ggc
CSE 2 : 0.29 ( 1%) usr 0.18 ( 1%) sys
0.44 ( 1%) wall 171 kB ( 0%) ggc
branch prediction : 0.20 ( 0%) usr 0.06 ( 0%) sys
0.16 ( 0%) wall 4067 kB ( 0%) ggc
combiner : 0.84 ( 2%) usr 0.22 ( 1%) sys
1.42 ( 2%) wall 13624 kB ( 1%) ggc
if-conversion : 0.28 ( 1%) usr 0.08 ( 0%) sys
0.41 ( 1%) wall 2 kB ( 0%) ggc
scheduling : 1.45 ( 3%) usr 0.63 ( 3%) sys
2.15 ( 3%) wall 4177 kB ( 0%) ggc
integrated RA : 1.83 ( 3%) usr 0.70 ( 3%) sys
2.45 ( 3%) wall 964084 kB (38%) ggc
LRA non-specific : 0.69 ( 1%) usr 0.33 ( 2%) sys
0.90 ( 1%) wall 2272 kB ( 0%) ggc
LRA virtuals elimination: 0.27 ( 0%) usr 0.15 ( 1%) sys
0.36 ( 0%) wall 1881 kB ( 0%) ggc
LRA reload inheritance : 0.09 ( 0%) usr 0.04 ( 0%) sys
0.12 ( 0%) wall 0 kB ( 0%) ggc
LRA create live ranges : 0.12 ( 0%) usr 0.06 ( 0%) sys
0.12 ( 0%) wall 1 kB ( 0%) ggc
LRA hard reg assignment : 0.09 ( 0%) usr 0.05 ( 0%) sys
0.20 ( 0%) wall 0 kB ( 0%) ggc
reload : 0.12 ( 0%) usr 0.06 ( 0%) sys
0.13 ( 0%) wall 0 kB ( 0%) ggc
reload CSE regs : 0.46 ( 1%) usr 0.09 ( 0%) sys
0.43 ( 1%) wall 2852 kB ( 0%) ggc
thread pro- & epilogue : 0.25 ( 0%) usr 0.13 ( 1%) sys
0.45 ( 1%) wall 37093 kB ( 1%) ggc
if-conversion 2 : 0.06 ( 0%) usr 0.02 ( 0%) sys
0.18 ( 0%) wall 0 kB ( 0%) ggc
peephole 2 : 0.11 ( 0%) usr 0.04 ( 0%) sys
0.18 ( 0%) wall 11 kB ( 0%) ggc
hard reg cprop : 0.12 ( 0%) usr 0.05 ( 0%) sys
0.19 ( 0%) wall 0 kB ( 0%) ggc
scheduling 2 : 1.05 ( 2%) usr 0.44 ( 2%) sys
1.67 ( 2%) wall 3203 kB ( 0%) ggc
machine dep reorg : 0.21 ( 0%) usr 0.05 ( 0%) sys
0.26 ( 0%) wall 10319 kB ( 0%) ggc
reorder blocks : 0.10 ( 0%) usr 0.03 ( 0%) sys
0.20 ( 0%) wall 20 kB ( 0%) ggc
shorten branches : 0.16 ( 0%) usr 0.05 ( 0%) sys
0.07 ( 0%) wall 0 kB ( 0%) ggc
final : 0.88 ( 2%) usr 0.47 ( 2%) sys
1.51 ( 2%) wall 15600 kB ( 1%) ggc
variable output : 0.30 ( 1%) usr 0.03 ( 0%) sys
0.33 ( 0%) wall 10352 kB ( 0%) ggc
symout : 0.04 ( 0%) usr 0.00 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
tree if-combine : 0.05 ( 0%) usr 0.02 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
straight-line strength reduction: 0.13 ( 0%) usr 0.07 ( 0%) sys
0.22 ( 0%) wall 0 kB ( 0%) ggc
store merging : 0.07 ( 0%) usr 0.03 ( 0%) sys
0.01 ( 0%) wall 9 kB ( 0%) ggc
address lowering : 0.01 ( 0%) usr 0.01 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
early local passes : 0.02 ( 0%) usr 0.00 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
unaccounted optimizations: 0.01 ( 0%) usr 0.01 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
rest of compilation : 5.83 (11%) usr 2.63 (13%) sys
8.49 (11%) wall 101391 kB ( 4%) ggc
unaccounted post reload : 0.01 ( 0%) usr 0.00 ( 0%) sys
0.01 ( 0%) wall 0 kB ( 0%) ggc
unaccounted late compilation: 0.03 ( 0%) usr 0.01 ( 0%) sys
0.02 ( 0%) wall 0 kB ( 0%) ggc
remove unused locals : 0.18 ( 0%) usr 0.08 ( 0%) sys
0.22 ( 0%) wall 0 kB ( 0%) ggc
address taken : 0.13 ( 0%) usr 0.03 ( 0%) sys
0.13 ( 0%) wall 0 kB ( 0%) ggc
rebuild frequencies : 0.03 ( 0%) usr 0.01 ( 0%) sys
0.03 ( 0%) wall 0 kB ( 0%) ggc
repair loop structures : 0.07 ( 0%) usr 0.03 ( 0%) sys
0.14 ( 0%) wall 0 kB ( 0%) ggc
TOTAL : 54.11 20.45 74.91
2544441 kB
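A report this long is easier to read when the per-pass timers are
ranked by wall time. The following is a minimal sketch, not something
from the original exchange: rank_passes is a hypothetical helper name,
and it assumes the usr/sys/wall/ggc column layout shown above, with
each entry either on one line or wrapped onto two lines as in this
mail. It skips the "phase ..." entries, since those are top-level
timers rather than individual passes.

```shell
# rank_passes [N]: read a GCC -ftime-report from stdin and print the
# N passes (default 10) with the largest wall time, slowest first.
# Hypothetical helper; assumes the column layout quoted in this mail.
rank_passes() {
    awk '
        # A line containing " usr" starts an entry; keep the pass name,
        # which is everything before the first colon.
        / usr/ {
            name = $0
            sub(/^ */, "", name)
            sub(/ *:.*/, "", name)
        }
        # The wall time is the number just before "(NN%) wall". It may
        # sit on the same line as the name or on the following line.
        match($0, /[0-9.]+ *\( *[0-9]+%\) *wall/) {
            wall = substr($0, RSTART, RLENGTH) + 0
            if (name != "" && name !~ /^phase /)
                printf "%.2f\t%s\n", wall, name
            name = ""
        }
    ' | sort -rn | head -n "${1:-10}"
}
```

Assuming the report has been saved to a file (the compiler writes it
to stderr), it could be used as: rank_passes 15 < report.txt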
> A thought just occurred to me, you are compiling the entire
> program + object.d right? Nothing else will link/be linked to
> the binary?
I'm passing all files except druntime files via the command line.
druntime files are imported via -Isource/runtime. But
essentially yes, I'm compiling the entire application in one
command so I can get cross-module inlining.
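For illustration, the file list for such a single-command build could
be generated rather than typed out. This is only a sketch under
assumptions: list_app_sources is a hypothetical helper name, and it
assumes every .d file under the source tree except those in the
runtime directory belongs on the command line, which matches the
invocation above but may not hold for other layouts.

```shell
# list_app_sources ROOT RUNTIME_DIR: print every .d file under ROOT,
# one per line, skipping anything below RUNTIME_DIR (those modules are
# meant to be found through -I instead of being compiled directly).
# Hypothetical helper; file names containing whitespace are not handled.
list_app_sources() {
    root=$1
    runtime=$2
    find "$root" -type f -name '*.d' ! -path "$runtime/*" | LC_ALL=C sort
}
```

The output could then be substituted into the existing command in
place of the explicit file list, for example with
$(list_app_sources source source/runtime).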
I tried moving all runtime files to the command line, but I get
errors about __entrypoint.
cc1d: error: module __entrypoint is in file '__entrypoint.d'
which cannot be read
Specify path to file '__entrypoint.d' with -I switch
> If that is the case, you should definitely compile with
> -fwhole-program. I suspect that may cut down your compilation
> time by half or even more.
If I only import __entrypoint.d and pass the rest of the runtime
files on the command line and compile with -fwhole-program, it
compiles in 5s, but I only get an 8-byte binary. I suspect this
is due to the error above about __entrypoint. That is, if
there's no entry point, the whole program gets garbage collected.
I think you might be on to something here though.
I'm out of time now; gotta catch a plane soon. I'll try to do
more troubleshooting when I return.
Mike