LDC: Constant Folding Across Nested Functions?
dsimcha
dsimcha at yahoo.com
Sat May 18 10:39:20 PDT 2013
Background: This came from an attempt to get rid of delegate
indirection in parallel foreach loops on LDC. LDC can inline
calls through delegates that always point to the same code. This
means that it can inline opApply delegates: once opApply itself
has been inlined, the delegate is effectively constant folded.
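For anyone unfamiliar with how opApply ties into this, here is a
minimal sketch. Countdown and sumTo are made-up names for
illustration and have nothing to do with std.parallelism. The
point is only that the compiler turns the foreach body into a
delegate and passes it to opApply, so every iteration goes
through an indirect call unless opApply and then the delegate
get inlined.

```d
// Hypothetical type for illustration; not part of std.parallelism.
struct Countdown {
    uint start;

    // foreach over a Countdown calls opApply and passes the loop
    // body as the delegate dg.
    int opApply(scope int delegate(ref uint) dg) {
        for (uint i = start; i > 0; --i) {
            // Indirect call through the delegate on every iteration.
            // A nonzero return means the body wants to stop early.
            if (auto stop = dg(i)) return stop;
        }
        return 0;
    }
}

uint sumTo(uint n) {
    uint total = 0;
    // The compiler lowers this loop body into a delegate that
    // captures total and is handed to Countdown.opApply.
    foreach (i; Countdown(n)) {
        total += i;
    }
    return total;
}

void main() {
    assert(sumTo(4) == 10); // 4 + 3 + 2 + 1
}
```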
Simplified case without unnecessarily complex context:
// Assume this function does NOT get inlined.
// In my real use case it's doing something
// much more complicated and in fact does not
// get inlined.
void runDelegate(scope void delegate() dg) {
    dg();
}

// Assume this function gets inlined into main().
uint divSum(uint divisor) {
    uint result = 0;

    // If divisor gets const folded and is a power of 2 then
    // the compiler can optimize the division to a shift.
    void doComputation() {
        foreach(i; 0U..1_000_000U) {
            result += i / divisor;
        }
    }

    runDelegate(&doComputation);
    return result;
}

void main() {
    // divSum gets inlined into here, but doComputation()
    // can't be because it's called through a delegate.
    // Therefore, the 2 is never const folded into
    // doComputation().
    auto ans = divSum(2);
}
The issue I'm dealing with in std.parallelism is conceptually the
same as this, but with much more context that's irrelevant to
this discussion. Would the following be a feasible compiler
optimization, either in the near future or at least in principle?
When an outer function is inlined, all non-static inner functions
would be recompiled with the information gained by inlining the
outer function. In this case doComputation() would be recompiled
with divisor const-folded to 2 and the division optimized to a
shift. A delegate to this post-inlining version would then be
passed to runDelegate().
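To make the request concrete, here is roughly the code I would
like the compiler to end up with for the divSum(2) call, written
out by hand. divSumSpecialized2 is a made-up name for
illustration, and the body is only my assumption of what the
specialized version would look like, not actual LDC output.

```d
// Not inlined in the real use case, as in the example above.
void runDelegate(scope void delegate() dg) {
    dg();
}

// Hypothetical hand-specialized copy of divSum for divisor == 2.
uint divSumSpecialized2() {
    uint result = 0;

    void doComputation() {
        foreach (i; 0U .. 1_000_000U) {
            // For an unsigned i, i / 2 is the same as i >> 1.
            result += i >> 1;
        }
    }

    // Still an indirect call, but the delegate now points to
    // code that was compiled knowing the divisor.
    runDelegate(&doComputation);
    return result;
}

void main() {
    auto ans = divSumSpecialized2();
}
```

The call through runDelegate() stays indirect here. The only
thing that changes is that the nested function's body is compiled
with the constant visible.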
Also, is there any trick I'm not aware of to work around the
standard compilation model and force this behavior now?