DIP56 Provide pragma to control function inlining

ponce contact at gam3sfrommars.fr
Mon Feb 24 09:39:39 PST 2014


On Monday, 24 February 2014 at 02:05:31 UTC, Walter Bright wrote:
> 1. It provides information to the compiler about runtime 
> frequency that it cannot obtain otherwise. This is very useful 
> information for generating better code.
>

> 2. Making it a hard requirement then means the user will have 
> to put versioning in it. It becomes inherently non-portable. 
> There is no way to predict what some other version of some 
> other compiler on some other system will do.
>
I'm not sure why it would be impossible to inline in some cases; 
I've never hit that limitation with ICC.
Like others, I would like unconditional and explicit optimization 
from the compiler.

> 3. In the end, the compiler should make the decision. Inlining 
> does not always result in faster code, as I pointed out in 
> another post.

Also, when I use "force inline" controls, it's very often to force 
"not-inline", so that the same bit of code is reused where the 
compiler would otherwise have inlined it.

Each optimization here is put through a repeatable, automated A-B 
test with 95% statistical significance on various inputs, and 
forcing inline/not-inline has been an effective tool to reduce 
the I-cache stress that plagues some very particular program 
areas that the compiler doesn't differentiate. This can be 
checked by looking at the assembly or the binary size afterwards.
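
For illustration only, here is a minimal C++ sketch of what forcing 
inline and not-inline can look like. The macro names FORCE_INLINE and 
NO_INLINE and the functions mix, slow_path and process are made up for 
this example; the attributes behind the macros are compiler extensions 
(GCC/Clang and MSVC spellings), not standard C++, and this is not a 
proposal for the D syntax.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical portability macros; the attributes are vendor extensions.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NO_INLINE    __attribute__((noinline))
#else
#  define FORCE_INLINE inline   // fallback: only a hint
#  define NO_INLINE             // fallback: no control at all
#endif

// Tiny helper used in hot code: request inlining at every call site.
FORCE_INLINE std::uint32_t mix(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

// Larger loop: keep a single out-of-line copy so that its callers
// share the same instructions instead of each getting a duplicate.
NO_INLINE std::uint32_t slow_path(const std::uint32_t* data, std::size_t n) {
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc = mix(acc ^ data[i]);
    }
    return acc;
}

// Caller: the short cases stay inline, the loop stays a real call.
std::uint32_t process(const std::uint32_t* data, std::size_t n) {
    if (n == 0) return 0;
    if (n == 1) return mix(data[0]);
    return slow_path(data, n);
}
```

Whether either choice actually helps has to be measured; comparing the 
generated assembly or the binary size with and without the attributes 
shows whether the compiler honoured them.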

I'm perfectly OK with the compiler doing what it wants when I 
don't tell it to inline or not. AFAIK the C/C++ inline keyword is 
mostly ignored by optimizing compilers; it's precisely a keyword 
that is both overused and meaningless.


> Perhaps the lesson is the word 'inline' carries certain 
> expectations with it, and the feature would be better 
> positioned as something like:
>
>     pragma(usage, often);
>     pragma(usage, rare);

To me it's not so much about usage frequency as about I-cache 
misses. Some inlining can be nearly free (small I-cache working 
set), or very costly (I-cache actively being the bottleneck 
through repeated misses due to a large working set).

