[Issue 16489] [backend][optimization][registers] DMD is 10-20 times slower for GLAS

Tue Sep 27 12:29:53 PDT 2016

https://issues.dlang.org/show_bug.cgi?id=16489

--- Comment #3 from Walter Bright <bugzilla at digitalmars.com> ---
Ok, I understand. This is the 'slicing' optimization where an aggregate can be
sliced up and stored in multiple registers. I went over it with deadalnix a
while ago, as it was identified as a key optimization. It applies more
generally than just for SIMD.

I also worked out a scheme for implementing it in the DMD back end. I don't
think it is that hard, unless I've misunderstood it. The slicing can be done if:

1. all accesses lie within slices (not across slice boundaries)
2. a pointer to the aggregate is not taken (because then accesses made through
the pointer can no longer be checked against condition 1).
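
The two conditions can be sketched as a simple eligibility check. This is only
an illustration in C, not code from the DMD back end: the Access record, the
can_slice function, and the assumption of equal-sized slices are all
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one access to the aggregate:
   byte offset from its start and size in bytes. */
typedef struct {
    size_t offset;
    size_t size;
} Access;

/* Returns true if an aggregate may be split into equal slices of
   slice_size bytes (an assumption made to keep the sketch short). */
bool can_slice(const Access *acc, size_t n, size_t slice_size,
               bool address_taken)
{
    /* Condition 2: once a pointer to the aggregate exists,
       accesses through it cannot be tracked. */
    if (address_taken || slice_size == 0)
        return false;

    for (size_t i = 0; i < n; i++) {
        if (acc[i].size == 0)
            return false;
        size_t first = acc[i].offset / slice_size;
        size_t last  = (acc[i].offset + acc[i].size - 1) / slice_size;
        /* Condition 1: an access must not straddle a slice boundary. */
        if (first != last)
            return false;
    }
    return true;
}
```

For example, with 8-byte slices, accesses at offsets 0 and 8 of size 8 pass,
while an 8-byte access at offset 4 fails because it crosses the boundary
between the first and second slice.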

The slicing then becomes a rewrite of the IR so the aggregate is decomposed
into multiple independent variables, and the rest of the backend then proceeds
normally.
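
To show the effect of the rewrite, here is a hand-written before/after pair in
C. It is a sketch of the idea at source level, not the actual IR
transformation; the Pair type and both function names are made up for the
example.

```c
#include <stdint.h>

/* Hypothetical 16-byte aggregate made of two 8-byte fields. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} Pair;

/* Before: the aggregate is one variable. Every access stays within
   a single field and its address is never taken. */
uint64_t sum_aggregate(uint64_t a, uint64_t b)
{
    Pair p;
    p.lo = a;
    p.hi = b;
    p.lo += 1;
    return p.lo + p.hi;
}

/* After: the aggregate is decomposed into two independent scalars,
   each of which is an ordinary candidate for a register. */
uint64_t sum_sliced(uint64_t a, uint64_t b)
{
    uint64_t p_lo = a;
    uint64_t p_hi = b;
    p_lo += 1;
    return p_lo + p_hi;
}
```

Both functions compute the same result; the second form is what the rest of a
back end would see after slicing, so no special handling of the aggregate is
needed from that point on.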

--