gdc or ldc for faster programs?
H. S. Teoh
hsteoh at quickfur.ath.cx
Tue Jan 25 22:33:37 UTC 2022
On Tue, Jan 25, 2022 at 01:30:59PM -0800, Ali Çehreli via Digitalmars-d-learn wrote:
[...]
> I posted the program to have more eyes on the assembly. ;)
[...]
I tested the code locally, and observed, just like Ali did, that the LDC
version is consistently slower than the GDC version, though only by a
small margin.
So I decided to compare the disassembly. Due to the large number of
templates in the main spellOut/spellOutImpl functions, I didn't have the
time to look at all of them; I just arbitrarily picked the !(int)
instantiation. And I'm seeing something truly fascinating:
- The GDC version has at its core a single idivl instruction for the /
and %= operators (I surmise that the optimizer realized that both
could share the same instruction because it yields both results). The
function is short and compact.
- The LDC version, however, seems to go out of its way to avoid the
idivl instruction, having instead a whole bunch of shr instructions
and imul instructions involving magic constants -- the kind of stuff
you see in bit-twiddling hacks when people try to ultra-optimize their
code. There also appears to be some loop unrolling, and the function
is markedly longer than the GDC version because of this.
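For anyone who hasn't seen the shift-and-multiply pattern before, here is
a minimal sketch of the idea in C. It is not the code LDC emitted: the
divisor 10, the unsigned 32-bit operand, and the helper names div10_magic,
mod10_magic and check_div10 are all assumptions made for illustration
(the !(int) instantiation is signed, which needs an extra sign-correction
step that is left out here).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes n / 10 without a divide instruction.
   0xCCCCCCCD is ceil(2^35 / 10); taking the 64-bit product and shifting
   it right by 35 yields the exact quotient for every 32-bit unsigned n. */
static uint32_t div10_magic(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical helper: the remainder costs one more multiply and a
   subtract once the quotient is known. */
static uint32_t mod10_magic(uint32_t n)
{
    return n - div10_magic(n) * 10u;
}

/* Compares the magic-constant results with the plain / and % operators
   on a spread of inputs, including both ends of the 32-bit range. */
static void check_div10(void)
{
    uint32_t n = 0u;
    for (int i = 0; i < 1000; ++i) {
        assert(div10_magic(n) == n / 10u);
        assert(mod10_magic(n) == n % 10u);
        n += 4294967u;  /* arbitrary stride; stays below 2^32 */
    }
    assert(div10_magic(4294967295u) == 4294967295u / 10u);
    assert(mod10_magic(4294967295u) == 4294967295u % 10u);
}
```

The plain n / 10u and n % 10u pair corresponds to what the single-idivl
version computes: one divide produces both the quotient and the
remainder.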
This is very interesting because idivl is known to be one of the slower
instructions, but GDC nevertheless considered it not worthwhile to
replace it, whereas LDC seems obsessed with avoiding idivl at all costs.
I didn't check the other instantiations, but it would appear that in
this case the simpler route of just using idivl won over the complexity
of trying to replace it with shr+mul.
T
--
Guns don't kill people. Bullets do.