Operator overloading leads to bad code optimization

max haughton maxhaton at gmail.com
Sun Dec 5 21:38:55 UTC 2021


On Friday, 3 December 2021 at 21:24:07 UTC, claptrap wrote:
> Just a simple function to split a bezier in two.
>
> Using "-O3"
>
> LDC the operator version is 84 instructions
> LDC the hand expanded math is 49 instructions.
>
> It seems something as simple as this should be better 
> optimised? Or am I missing something?
>
> https://godbolt.org/z/4h9vob3Yo
>
> In fact there's quite a few bits where it looks like completely 
> redundant code is left in? Eg...
>
> 123 movss   dword ptr [rsp - 24], xmm1
> 124 movss   xmm0, dword ptr [rip + .LCPI4_0]
> 125 mulss   xmm1, xmm0
> 126 movss   dword ptr [rsp - 24], xmm1
>
>
> 137 movss   dword ptr [rsp - 24], xmm2
> 138 mulss   xmm2, xmm0
> 139 movss   dword ptr [rsp - 24], xmm2

This is (to me at least) an odd one. Maybe there's a 
pass-ordering issue here that leaves the redundant stores in.

Seems like GCC does not have this issue.
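
For anyone reading without the Godbolt link open, the comparison is roughly of the following shape. This is only a sketch under stated assumptions: Point, Quad, Split, splitOps and splitExpanded are made-up names, and the quadratic midpoint split is a guess at the kind of function meant, not claptrap's actual source.

```d
// Hypothetical 2D point with overloaded arithmetic (illustrative only).
struct Point
{
    float x = 0, y = 0;

    // Point + Point and Point - Point, component-wise.
    Point opBinary(string op)(Point rhs) const
        if (op == "+" || op == "-")
    {
        return Point(mixin("x " ~ op ~ " rhs.x"), mixin("y " ~ op ~ " rhs.y"));
    }

    // Point * scalar.
    Point opBinary(string op)(float s) const
        if (op == "*")
    {
        return Point(x * s, y * s);
    }
}

// Hypothetical quadratic bezier: three control points.
struct Quad { Point p0, p1, p2; }

// The two halves produced by a split.
struct Split { Quad left, right; }

// Operator version: relies on Point.opBinary for every step.
Split splitOps(const Quad q)
{
    auto a = (q.p0 + q.p1) * 0.5f;
    auto b = (q.p1 + q.p2) * 0.5f;
    auto m = (a + b) * 0.5f;
    return Split(Quad(q.p0, a, m), Quad(m, b, q.p2));
}

// Hand-expanded version: the same arithmetic written out per component.
Split splitExpanded(const Quad q)
{
    float ax = (q.p0.x + q.p1.x) * 0.5f;
    float ay = (q.p0.y + q.p1.y) * 0.5f;
    float bx = (q.p1.x + q.p2.x) * 0.5f;
    float by = (q.p1.y + q.p2.y) * 0.5f;
    float mx = (ax + bx) * 0.5f;
    float my = (ay + by) * 0.5f;
    return Split(Quad(q.p0, Point(ax, ay), Point(mx, my)),
                 Quad(Point(mx, my), Point(bx, by), q.p2));
}

void main()
{
    auto q = Quad(Point(0, 0), Point(1, 2), Point(2, 0));
    // Both versions perform the same float operations in the same order.
    assert(splitOps(q) == splitExpanded(q));
}
```

Since both functions do the same arithmetic, one would expect the optimiser to produce near-identical code for them at "-O3" once the opBinary calls are inlined; the difference in instruction counts quoted above is what makes this surprising.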


More information about the digitalmars-d-ldc mailing list