By ref and by pointer kills performance.
claptrap
clap at trap.com
Tue Feb 13 02:11:45 UTC 2024
I was refactoring some code and changed a parameter from
by-value to by-pointer, and saw performance drop by 50%. This
is a highly reduced example of what I found, but basically
passing something into a function by reference or pointer seems
to make the compilers (it affects both DMD and LDC) treat it as
if it's volatile and must be reloaded from memory on every use.
This also inhibits LDC's auto-vectorization of the code.
https://d.godbolt.org/z/oonq1drd9
```d
void fillBP(uint* value, uint* dest)
{
    dest[0] = *value;
    dest[1] = *value;
    dest[2] = *value;
    dest[3] = *value;
}
```
codegen DMD -->
push RBP
mov RBP,RSP
mov ECX,[RSI]
mov [RDI],ECX
mov EDX,[RSI]
mov 4[RDI],EDX
mov R8D,[RSI]
mov 8[RDI],R8D
mov R9D,[RSI]
mov 0Ch[RDI],R9D
pop RBP
ret
codegen LDC -->
mov eax, dword ptr [rdi]
mov dword ptr [rsi], eax
mov eax, dword ptr [rdi]
mov dword ptr [rsi + 4], eax
mov eax, dword ptr [rdi]
mov dword ptr [rsi + 8], eax
mov eax, dword ptr [rdi]
mov dword ptr [rsi + 12], eax
ret
```d
void fillBV(uint value, uint* dest)
{
    dest[0] = value;
    dest[1] = value;
    dest[2] = value;
    dest[3] = value;
}
```
codegen DMD -->
push RBP
mov RBP,RSP
mov [RDI],ESI
mov 4[RDI],ESI
mov 8[RDI],ESI
mov 0Ch[RDI],ESI
pop RBP
ret
codegen LDC -->
movd xmm0, edi
pshufd xmm0, xmm0, 0
movdqu xmmword ptr [rsi], xmm0
ret
Interestingly, if you do this...
```d
void fillBP(uint* value, uint* dest)
{
    uint tmp = *value;
    dest[0] = tmp;
    dest[1] = tmp;
    dest[2] = tmp;
    dest[3] = tmp;
}
```
You get code identical to the by-value version, apart from the
single initial load from memory.
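The same local-copy workaround can be sketched for a by-reference parameter. This is only an illustration: the name `fillBR` is hypothetical, and it assumes that `ref` parameters behave like pointers here, as the post suggests but the listings above do not show.

```d
// Hypothetical by-ref variant of fillBP (name not from the original code).
// The referenced value is copied into a local once, so the stores below
// no longer depend on memory that `dest` might overlap.
void fillBR(ref const uint value, uint* dest)
{
    const uint tmp = value; // single load through the reference
    dest[0] = tmp;
    dest[1] = tmp;
    dest[2] = tmp;
    dest[3] = tmp;
}
```

Whether DMD or LDC then emits the same code as for `fillBV` would need to be checked on godbolt; that result is not claimed here.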
I'm not a compiler guy, so maybe there's some rationale for this
that I don't know, but it seems like the compiler should be able
to read "*value" once and cache it.
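One likely factor, offered as an assumption rather than a confirmed diagnosis of DMD or LDC, is pointer aliasing: the compiler generally cannot prove that `value` does not point into `dest`, so a store through `dest` might change `*value`. The sketch below uses two hypothetical functions, `bumpBP` and `bumpCached` (not from the original code), where each store writes `*value + 1`, so the reloading and cached versions give different results when the pointers overlap.

```d
// Hypothetical variant: reloads *value before every store.
void bumpBP(uint* value, uint* dest)
{
    dest[0] = *value + 1;
    dest[1] = *value + 1; // sees the new value if value == &dest[0]
}

// Hypothetical variant: reads *value once and reuses the local copy.
void bumpCached(uint* value, uint* dest)
{
    uint tmp = *value;
    dest[0] = tmp + 1;
    dest[1] = tmp + 1;
}
```

With a two-element buffer holding `[10, 0]` and `value` pointing at its first element, `bumpBP` leaves `[11, 12]` while `bumpCached` leaves `[11, 11]`. A compiler that cannot rule out this overlap has to keep the reloads to preserve the first behavior.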