[Issue 14943] dmd should inline more aggressively
via Digitalmars-d-bugs
digitalmars-d-bugs at puremagic.com
Thu Aug 20 22:19:57 PDT 2015
https://issues.dlang.org/show_bug.cgi?id=14943
--- Comment #1 from hsteoh at quickfur.ath.cx ---
Further notes:
- gdc not only inlines the call trees of .empty, .front, and .popFront, it also
applies other loop optimizations such as strength reduction, rewriting the a =>
a*7 as a += b; b += 7. I'm not sure whether dmd is capable of this, but in any
case the opportunity is missed: because .popFront was not inlined, the
optimizer could not have applied strength reduction.
- gdc's aggressive inlining also allows the various loop counters and
accumulators to be held entirely in registers, whereas the function calls
generated by dmd require dereferencing the addresses of stack variables, which
adds an extra layer of indirection. Again, an opportunity missed by not
inlining aggressively enough.
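To make the two points above concrete, here is a minimal sketch in D. It is not the test case from this bug report: the TimesSeven range and both function names are hypothetical, chosen only for illustration. The first function shows the three range primitives a loop calls on every iteration; the second is a hand-written version of what such a loop could look like once those calls are inlined and the multiplication is strength-reduced to a running addition. Whether a given compiler actually produces the second form depends on its inliner and optimizer.

```d
import std.stdio : writeln;

// Hypothetical input range yielding 0*7, 1*7, ..., (n-1)*7.
struct TimesSeven
{
    int i, n;
    bool empty() const { return i >= n; }
    int front() const { return i * 7; }   // one multiplication per element
    void popFront() { ++i; }
}

// Range-based loop: three member calls per iteration unless the
// compiler inlines .empty, .front and .popFront.
int sumViaRange(int n)
{
    int total = 0;
    for (auto r = TimesSeven(0, n); !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Hand-written equivalent after inlining and strength reduction:
// the multiplication is replaced by a running value b, bumped by 7.
int sumReduced(int n)
{
    int total = 0;
    int b = 0;              // invariant: b == i * 7 at the top of each iteration
    foreach (i; 0 .. n)
    {
        total += b;
        b += 7;
    }
    return total;
}

void main()
{
    // Both forms must agree for every input.
    foreach (n; [0, 1, 10, 1000])
        assert(sumViaRange(n) == sumReduced(n));
    writeln(sumViaRange(10));   // prints 315
}
```

In the second form, total and b are plain locals with no calls in between, so nothing forces them out of registers. In the first form, if popFront stays an out-of-line call, the range has to live in memory so that its address can be passed as the this pointer.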
For reference, here's the inner loop produced by gdc:
  403b80:  89 d7                   mov    %edx,%edi
  403b82:  c1 ef 1f                shr    $0x1f,%edi
  403b85:  8d 34 3a                lea    (%rdx,%rdi,1),%esi
  403b88:  83 c2 07                add    $0x7,%edx
  403b8b:  83 e6 01                and    $0x1,%esi
  403b8e:  39 fe                   cmp    %edi,%esi
  403b90:  75 1e                   jne    403bb0 <int test.fun(int)+0x80>
  403b92:  89 c6                   mov    %eax,%esi
  403b94:  8d 14 cd 00 00 00 00    lea    0x0(,%rcx,8),%edx
  403b9b:  c1 ee 1f                shr    $0x1f,%esi
  403b9e:  01 f0                   add    %esi,%eax
  403ba0:  29 ca                   sub    %ecx,%edx
  403ba2:  d1 f8                   sar    %eax
  403ba4:  01 d0                   add    %edx,%eax
  403ba6:  83 c2 07                add    $0x7,%edx
  403ba9:  0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  403bb0:  83 c1 01                add    $0x1,%ecx
  403bb3:  39 cb                   cmp    %ecx,%ebx
  403bb5:  75 c9                   jne    403b80 <int test.fun(int)+0x50>