DIP56 Provide pragma to control function inlining

Dmitry Olshansky dmitry.olsh at gmail.com
Mon Feb 24 00:25:42 PST 2014


24-Feb-2014 04:33, Walter Bright wrote:
> On 2/23/2014 3:55 PM, Mike wrote:
>> The difference is it was explicitly told to do something and didn't.
>> That's insubordination.
>
> I view this as more in the manner of providing the equivalent of runtime
> profiling information to the optimizer, in indirectly saying how often a
> function is executed.
>
> Optimizing is a rather complicated process, and particular optimizations
> very often have weird and unpredictable interactions with other
> optimizations.

Speaking of other optimizations.

There is a thing called a tail call. Funnily enough, compilers still
consider eliminating it an optimization, whereas in practice the
difference usually means "stack overflow" vs "normal execution" for
functional-style code.
But I'd prefer we stay focused on one particular optimization here.
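
To make the tail-call aside concrete, here is a minimal C++ sketch. It is
not from the thread, and the function names are made up for illustration.
It shows a function whose recursive call is in tail position next to the
loop that tail-call elimination would effectively turn it into. Whether
the recursive form actually runs in constant stack space depends on the
compiler and optimization level, which is exactly the problem described
above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: sum of 1..n with the recursive call in tail
// position. If the compiler eliminates the tail call, the frame is reused;
// if it does not, stack depth grows linearly with n.
std::uint64_t sum_to_rec(std::uint64_t n, std::uint64_t acc = 0) {
    if (n == 0) return acc;
    return sum_to_rec(n - 1, acc + n);  // nothing left to do after the call
}

// The same computation written as the loop that tail-call elimination
// would produce: identical result, constant stack usage.
std::uint64_t sum_to_loop(std::uint64_t n) {
    std::uint64_t acc = 0;
    while (n != 0) {
        acc += n;
        --n;
    }
    return acc;
}

// Small self-check: both forms agree for a modest input.
void check_sums() {
    assert(sum_to_rec(1000) == sum_to_loop(1000));
}
```

For small n the two are interchangeable. For large n only the loop form is
safe unless the language or compiler guarantees the transformation.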

> For example, in the olden days, C compilers had a 'register' storage
> class. Optimizers' register allocation strategy was so primitive it
> needed help. Over time, however, it became apparent that uses of
> 'register' became bit-rotted due to maintenance, resulting in all the
> wrong variables being enregistered. Compiler register allocation got a
> lot better, almost always being better than the users'.

Once compilers can actually make the best inlining decisions on their
own, these kinds of options may become irrelevant.
However, a compiler may need to run a profiler on relevant input to
figure that out all by itself.

> Not only that,
> but with generic code, and optimization rewrites of code, many variables
> would disappear and new ones would take their place. Different CPUs
> needed different register allocation strategies. What to do with
> 'register' then?

Indeed, 'register' was tied to something immaterial - a variable, whereas
in fact there are plenty of temporaries and induction variables that a
programmer can't label.

In contrast, generic code is functions upon functions passed through
other tiny functions. This is in part what makes inlining so special.
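
As an illustration of "functions passed through other tiny functions",
here is a small C++ sketch. It is not from the thread, and all names in it
are hypothetical. The generic algorithm calls two one-line callables per
element. The hand-written version below it is what the loop amounts to
once both calls are inlined. If they are not inlined, every element pays
for two calls to do a trivial amount of work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tiny one-liners of the kind generic code is assembled from.
struct Square { int operator()(int x) const { return x * x; } };
struct IsOdd  { bool operator()(int x) const { return x % 2 != 0; } };

// Generic algorithm: each element goes through two tiny callables.
template <typename Pred, typename Fn>
long sum_mapped_if(const std::vector<int>& xs, Pred pred, Fn fn) {
    long total = 0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        if (pred(xs[i])) total += fn(xs[i]);
    }
    return total;
}

// Hand-written equivalent: the same loop with both calls inlined by hand.
long sum_odd_squares(const std::vector<int>& xs) {
    long total = 0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        if (xs[i] % 2 != 0) total += xs[i] * xs[i];
    }
    return total;
}

// Small self-check: the generic and hand-written forms agree.
void check_generic() {
    std::vector<int> xs = {1, 2, 3, 4, 5};
    assert(sum_mapped_if(xs, IsOdd(), Square()) == sum_odd_squares(xs));
}
```

The two functions compute the same value. The only question is whether
the compiler collapses the first into the second.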

> The result was compilers began to take the 'register' as a hint, and
> eventually moved to totally ignoring 'register', as it turned out to be
> a pessimization.
>
> I suspect that elevating one particular optimization hint to being an
> absolute command may not turn out well. Inlining already has performance
> issues, as it may increase the size of an inner loop beyond what will
> fit in the cache, for just one unexpected result. For another it may
> mess up the register allocation of the caller.

> "Inlining makes it
> faster" is not always true.

Like I'm a bloody idiot. But once your performance problem is (after
perusing the ASM) a particular function not being inlined, dancing around
the compiler in the DARK until it strikes home (if ever) isn't a viable
option.

And with DMD, in like 90% of cases my problem is some critical
one-liner not being inlined. In contrast, register allocation is mostly
fine.
There are some marvelous codegen gems though:
https://d.puremagic.com/issues/show_bug.cgi?id=10932
where the compiler moves from ebx to edx via a stack slot for no apparent
reason.

> Do you really want to weld this in as an
> absolute requirement in the language?

Aye. That and explicit tail calls but that's a separate matter.
Experimental compilers may choose to issue warnings saying that they 
basically can't inline (yet or by design).
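
For comparison, this is roughly how the same request is usually spelled
in C++ today. It is a sketch, not something from the thread: the
FORCE_INLINE macro and the function names are made up for illustration,
while `__forceinline` (MSVC) and `__attribute__((always_inline))`
(GCC/Clang) are existing compiler extensions. As far as I know, GCC
documents a failure to inline an always_inline function as an error,
which is close to the "command, with a diagnostic if it can't be done"
behaviour argued for above.

```cpp
#include <cassert>

// Hypothetical portability macro around existing compiler extensions.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // plain hint: no guarantee at all
#endif

// A critical one-liner that the caller wants inlined unconditionally.
FORCE_INLINE int clamp_byte(int x) {
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

// Caller on the hot path; with the attribute honored, no call remains here.
int saturating_add(int a, int b) {
    return clamp_byte(a + b);
}

// Small self-check of the clamping behaviour.
void check_clamp() {
    assert(saturating_add(200, 100) == 255);
}
```

The fallback branch shows the weaker alternative: a plain `inline` is only
a hint, so the programmer is back to reading the generated assembly.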

-- 
Dmitry Olshansky

