What does 'inline' mean?

Tue Jun 9 14:21:09 UTC 2020

On Tue, Jun 9, 2020 at 7:30 PM Walter Bright via Digitalmars-d <
digitalmars-d at puremagic.com> wrote:

> On 6/8/2020 7:09 AM, Manu wrote:
> > On Mon, Jun 8, 2020 at 8:20 PM Walter Bright via Digitalmars-d
> > <digitalmars-d at puremagic.com> wrote:
> >
> >     On 6/7/2020 11:14 PM, Manu wrote:
> >      > I think a first part of the conversation to understand, is that
> >      > since D doesn't really have first-class `inline` (just a pragma,
> >      > assumed to be low-level compiler control), I think most people
> >      > bring their conceptual definition over from C/C++, and that
> >      > definition is a little odd (although it is immensely useful), but
> >      > it's not like what D does.
> >
> >     C/C++ inline has always been a hint to the compiler, not a command.
> >
> >
> > It's not a hint at all. It's a mechanical tool; it marks symbols with
> > internal linkage, and it also doesn't emit them if it's never referenced.
> > The compiler may not choose to ignore that behaviour,
>
> The C/C++ inline semantics revolve around the mechanics of .h files because
> it doesn't have modules. These reasons are irrelevant for D.
>

That's completely incorrect. It's 100% as relevant for D as it is for C++
for exactly the same reasons.
You'll need to support your claim.

> > it's absolutely necessary, and very important.
>
> For .h files, sure. Why for D, though?
>

Because in this category of use case, inlining is a concept related to
native languages with binary linkage, and not really anything to do with
the language specifically.
Calling an inline function from a foreign module does not require that I
link that module's compiled code, because the inline function is emitted
locally to the calling CU.
This is just as relevant in D as it is in C++, although the technical
manifestations are slightly different; in C++, a function defined in a
header without `inline` emits a hard copy of the function in each CU, and
you get link errors because of multiply defined symbols. In D, you end up
with no copy of the function anywhere, and link errors because of an
undefined symbol. `Inline` addresses that same issue in both cases the same
way.
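
As a rough illustration of the C++ side of this, here is a sketch of a header-only function. The file names and the functions `clamp01` and `brightness` are hypothetical and not taken from this thread; the two "files" are shown in one listing so that it compiles on its own.

```cpp
// ---- mathutil.h (hypothetical header, included by several .cpp files) ----
#ifndef MATHUTIL_H
#define MATHUTIL_H

// Without `inline`, every translation unit that includes this header would
// emit its own strong definition of clamp01, and the link would fail with a
// multiple-definition error. With `inline`, each including unit may carry a
// definition and the linker accepts the duplicates, so no separate object
// file or library has to be linked in order to call it.
inline double clamp01(double x) {
    if (x < 0.0) return 0.0;
    if (x > 1.0) return 1.0;
    return x;
}

#endif // MATHUTIL_H

// ---- user.cpp (hypothetical caller; would normally #include "mathutil.h") ----
double brightness(double raw) {
    // The body of clamp01 is available right here in the calling unit.
    return clamp01(raw / 255.0);
}
```

A caller only needs the header on its include path; nothing from the module that owns `clamp01` has to appear on the link line.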

There are a whole lot of reasons this comes up in binary ecosystems. Build
systems and dev tooling are a really complex topic, which tends to take
years and years of iteration for million-LOC software, and there are
frequently awkward requirements for various reasons. We need to fit in with
existing expectations.
It must be painless to plug into existing native code ecosystems; we have
invested heavily in extern C and C++, but compatibility with those
ecosystems also involves integrating into complex existing legacy build,
link, and distribution environments.

> >     Why does it matter where it is emitted? Why would you want multiple
> >     copies of the same function in the binary?
> >
> > I want zero copies if it's never called. That's very important.
>
> Why are 0 or N copies fine, but 1 is not?
>

No, that's not what I've said; I expect exactly N copies of x() for N CUs
which reference x(). And they should appropriately be marked with internal
linkage.
Any other result is just 'weird', and while it might be workable, it's just
asking for trouble. (1 redundant copy in the owning CU despite being
un-referenced MIGHT be link-stripped if the surrounding tooling all works
as we hope... but it also might not, as I have demonstrated on multiple
occasions)
There's just no reason to invite this problem... and no advantage.

> > I also want copies to appear locally when it is referenced; inline
> > functions should NOT require that you link something to get the code...
> > that's not inline at all.
>
> Why? What problem are you solving?
>

Literally inline function calling. No-link libs are a really common and
extremely useful thing.

> >     Why? What is the problem with the emission of one copy where it was
> >     defined?
> >
> > That's the antithesis of inline. If I wanted that, I wouldn't mark it
> > inline. I don't want a binary full of code that shouldn't be there. It's
> > very important to be able to control what code is in your binaries.
>
> I know I'm being boring, but why is it important? Also, check out
> -gc-sections:
>
> https://gcc.gnu.org/onlinedocs/gnat_ugn/Compilation-options.html
>
> Which is a general solution, not a kludge.
>

I guess my key issue is that I have complained before because I have
experienced multiple cases of link stripping not working the way you say.
There is no reason to invite that problem; we can trivially eliminate the
problem at the source. I don't care if it's fixable, I don't want the
problem to exist. NOBODY wants to be tasked to solve awkward build
ecosystem issues... we already have a working build ecosystem, and this D
thing is making it harder than it already is. That's a really bad thing,
and I would entertain excuses for this if not for the fact that it's
trivially avoidable. We do not need to waste anybody's time this way; we
won't win any friends by wasting their time with problems they HATE to have.

The secondary issue is, I want my binaries to contain what I put in there,
and not what I don't. Rogue symbols that I specified shouldn't exist only
bloat the binary, invite the possibility of link collisions, and raise the
probability of the issues I mentioned above.

> >     The PR I have on this makes it an informational warning. You can
> >     choose to be notified if inlining fails.
> >
> > That's not sufficient though for all use cases. This is a different kind
> > of inline (I think it's 'force inline').
>
> The default, and pragma(inline,true) are sufficient for all use cases
> except which ones?
>

Recursive calls or taking the address of functions (and probably other
situations) are incompatible with a hard-error based inline.
CU-inline is the overwhelmingly common case.
Absolutely-must-call-site-inline is extremely rare by contrast, but very
important in the rare instance it's necessary.
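
For comparison, C++ `inline` accepts both of those situations, because it does not promise expansion at every call site. The following is a minimal sketch; the names `factorial`, `UnaryFn`, `apply` and `viaPointer` are hypothetical and only serve the illustration.

```cpp
// Recursive: the compiler cannot expand every call at the call site, yet
// `inline` is still valid here because in C++ it does not demand that.
inline unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Hypothetical function-pointer alias used to pass factorial around.
using UnaryFn = unsigned long long (*)(unsigned);

inline unsigned long long apply(UnaryFn fn, unsigned n) {
    return fn(n);
}

unsigned long long viaPointer(unsigned n) {
    // Taking the address of an inline function is allowed; the compiler
    // typically keeps an out-of-line body in this unit to point at.
    UnaryFn fn = &factorial;
    return apply(fn, n);
}
```

A mode that turns every failed call-site expansion into a hard error would have to reject both `factorial` and `viaPointer`, which is the incompatibility described above.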

I suggest the default should model the common case, and the rare niche
case can be the 3rd 'force' state by explicit request.

> > This #3 mechanic is rare, and #1/2 are overwhelmingly common. You don't
> > want a sea of warnings to apply to cases of 1/2.
>
> You won't get a sea of warnings unless you put pragma(inline,true) on a
> sea of functions that can't be inlined.
>

There are classes of modules where EVERY function is inline. No-link libs
are a very common and useful tool.

> > I think it's important to be able to distinguish #3 from the other 2
> > cases.
>
> Why?
>

Because they have different use cases; the guarantees in #3 are critical
when required, and absolutely not desired in cases of #1-2.
The C++ design which only satisfies #1-2 leaves #3 in hope-for-the-best
territory, which you need to manually validate, and then have no way to
detect when the conditions or context change.
If we only have one, it should model #1-2, but I think it's useful to model
#3 in addition; that would helpfully improve the reliability of code that
depends on #3.
And it's kinda what we tried to model 9 years ago, except we lost #1-2
along the way...

> >     At its root, inlining is an optimization, like deciding which
> >     variables go into registers.
> >
> > No, actually... it's not. It's not an 'optimisation' in any case except
> > maaaaybe #2; it's about control of the binary output and code generation.
>
> Inlining is 100% about optimization.
>

No. I feel like I couldn't make this case clearer... It's got almost
nothing to do with optimisation, it's about codegen control. In the very
rare event that I disagree with an optimiser's heuristic, it's nice to have
an override hint, but that's like a 0.1% use case. Inline provides
mechanical control over codegen, and this is a native systems language. We
must have that control.
GDC/LDC have it, but it really should be specced to have uniform behaviour
across D compilers. We don't have macros like C++ does to easily wrangle
cases where implementations differ.
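
The kind of C++ macro wrangling referred to here usually looks something like the sketch below. `FORCE_INLINE`, `lerpFixed` and `blend` are hypothetical names; `__forceinline` and `__attribute__((always_inline))` are the MSVC and GCC/Clang spellings respectively.

```cpp
// FORCE_INLINE is a hypothetical macro name. It expands to whichever
// compiler-specific spelling requests call-site expansion.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fallback: plain inline, no stronger request
#endif

// Fixed-point linear interpolation between a and b, with t256 in [0, 256].
FORCE_INLINE int lerpFixed(int a, int b, int t256) {
    return a + (((b - a) * t256) >> 8);
}

int blend(int a, int b) {
    // Midpoint blend; the call to lerpFixed is requested to be expanded here.
    return lerpFixed(a, b, 128);
}
```

The preprocessor hides the per-compiler differences behind one name, which is the part D has no direct equivalent for and why a uniform spec would matter.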

> > Low level control of code generation is important in native languages;
> > that's why we're here.
>
> Optimizing things that don't matter is wasting your valuable time.
> Optimizing things that are more effectively and thoroughly done with the
> linker (-gc-sections) - it's like chipping wood with a hatchet rather than a
> woodchipper.
>

This isn't about optimisation, it's about controlling the output of the
compiler. Taking that control away and forcing us to try and reproduce that
expected functionality with external tooling within a deeply complex build
ecosystem is wasting our valuable time.