inlining...
John Colvin
john.loughran.colvin at gmail.com
Fri Mar 14 05:02:52 PDT 2014
On Friday, 14 March 2014 at 11:04:34 UTC, Manu wrote:
> On 14 March 2014 18:03, John Colvin
> <john.loughran.colvin at gmail.com> wrote:
>
>> As much as I like the idea:
>>
>> Something always tells me this is the compiler's job... What clever
>> reasoning are you applying that the compiler's inliner can't? It
>> seems like a different situation to, say, SIMD code, where correctly
>> structuring loops can require a lot of gymnastics that the compiler
>> can't or won't (floating point conformance) do. The inlining decision
>> seems easily automatable in comparison.
>>
>> I understand that unoptimised builds for debugging are a problem, but
>> a sensible compiler lets you hand-pick your optimisation passes.
>>
>> In short: why are compilers not good enough at this that the
>> programmer needs to be involved?
>>
>
> The compiler applies generalised heuristics, which are certainly for
> the 'common' case, whatever that happens to be. The compiler simply
> doesn't know what you're doing, so it's very hard for the compiler to
> do anything really intelligent.
> Inlining heuristics are fickle, and they also don't know what you're
> actually trying to do. Is a function 'long'? How long is 'long'? Is
> the function 'hot'? Do we prefer code size or execution speed? Is the
> function called only from this location, or is it used in many
> locations? Etc.
> Inlining is one of the fuzziest pieces of logic in the compiler, and
> relies on a lot of information that is impossible for the compiler to
> deduce, so it applies heuristics to try and do a decent job, but it's
> certainly not perfect.
>
> I argue that nothing so fickle can exist in the language without a
> manual override. Especially not in a native language.
>
> In my current case, the functions I need to inline are not exactly
> trivial. They're really pushing the boundaries of the compiler's
> inliner heuristics, and then I'm calling a series of such functions
> that operate on parallel data.
> If they don't inline, the performance equals the sum of the functions
> plus some overhead. If they all inline, the cost is only that of the
> longest one, with no overhead (the others fill in pipeline gaps).
> Further, some of these functions embed some shared work... if they
> don't inline, this work is repeated. If they do inline, the redundant
> repeated work is eliminated.
>
> My experiments with std.algorithm were a failure. I realised quickly
> that I couldn't rely on the inliner to do a satisfactory job, and the
> optimiser was unable to do its job properly.
> std.algorithm could really benefit from the mixin suggestion, since
> things like predicate functions are always trivial, usually supplied
> as little lambdas, and inlining isn't reliable, especially in debug
> builds.
> Something like algorithm loop sugar shouldn't run heaps worse than an
> explicit loop just because it happens to be implemented by a generic
> function.
Thanks for the explanations.
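To make the shared-work point concrete, here's a rough sketch (the
functions, names and the shared 1.0f / range factor are made up purely
for illustration, not taken from any real code):

float scaleA(float x, float range) { return (1.0f / range) * x * 2.0f; }
float scaleB(float x, float range) { return (1.0f / range) * x + 1.0f; }

float combined(float x, float range)
{
    // If scaleA and scaleB both inline here, the compiler can compute
    // 1.0f / range once and share it between them; across real call
    // boundaries that redundant work stays.
    return scaleA(x, range) + scaleB(x, range);
}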
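The std.algorithm situation looks roughly like this (a hedged sketch;
the lambdas are just stand-ins for the kind of trivial predicates
involved):

import std.algorithm : filter, map;
import std.array : array;

int[] viaAlgorithm(int[] xs)
{
    // Unless filter, map and the little lambdas all inline, this runs
    // well behind the hand-written loop below, especially in debug builds.
    return xs.filter!(x => x > 0)
             .map!(x => x * 2)
             .array;
}

int[] viaLoop(int[] xs)
{
    int[] result;
    foreach (x; xs)
        if (x > 0)
            result ~= x * 2;
    return result;
}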
Another use case is to aid propagation of compile-time information for
optimisation.
A function might look like a poor candidate for inlining for other
reasons, but if there's an integer parameter coming in that is
statically known to the caller and will be used to decide a loop
length, inlining allows that information to be propagated to the
callee. Static loop lengths => well-optimised loops, with
opportunities for optimal unrolling. Even quite a large function can
be a good choice to inline in that situation.
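A minimal sketch of what I mean, using a template parameter as a
stand-in for the constant that inlining would propagate (sumFixed and
sumDynamic are hypothetical names, not existing functions):

float sumFixed(size_t N)(const(float)[] data)
{
    float total = 0;
    foreach (i; 0 .. N)   // N is known at compile time: unrollable
        total += data[i];
    return total;
}

float sumDynamic(const(float)[] data, size_t n)
{
    float total = 0;
    foreach (i; 0 .. n)   // n is only known at run time
        total += data[i];
    return total;
}

// A call like sumFixed!8(buffer) gives the optimiser the same
// information that inlining a call with a literal argument would.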
I don't know how good compilers are at taking this sort of thing
into account already.