How to get to a class initializer through introspection?

Thu Aug 6 09:41:01 UTC 2020

On Wed, 05 Aug 2020 22:19:11 +0000, Johan wrote:

>> But initializer symbols are currently not in COMDAT, or does LDC
>> implement that? That's a crucial point, as it addresses Andrei's
>> initializer bloat point. And it also means you can avoid emitting the
>> symbol if it's never referenced. But if it is referenced, it will be
>> available.
> 
> It does not matter whether the initializer symbol is in COMDAT, because
> (currently) it has to be dynamically accessible (e.g. by a user of a
> compiled library or e.g. by druntime GC object destroy code) and thus
> cannot be determined whether it is referenced at link/compile time.

You're right, I forgot for a second that right now, the initializer 
symbol has to be accessible. So obviously making it comdat now is not 
possible. However, I think Andrei wanted to make most of that optional 
with the TypeInfo changes.

Regarding "e.g. by a user of a compiled library": That is exactly my 
point when I said the initializer _expression_ must always be available 
to the compiler, even for such precompiled libraries. And whenever an 
initializer is accessed in some code unit, the symbol should be generated 
and put into comdat.

This way, there can be exactly 0 or 1 instances of the initializer 
symbol, pay-as-you-go depending on whether it's used.

> 
>> Initializer functions have the drawback that backends can no longer
>> choose different strategies for -Os or -O2. All the other benefits you
>> mention (=void holes, padding shenanigans, or non-zero-but-repetitive-
>> constant double[1million] arrays, ...) can also be handled properly by
>> the backend in the initializer-symbol case if the initializer
>> expression is available to the backend. And you have to ensure that the
>> initialization function can always be inlined, so without -O flags it
>> may also lead to suboptimal code...
> 
> Backends can also turn an initializer function into a memcpy function.

Yes, but as there's no symbol with a global name, the compiler has to 
somehow place the data locally (local symbol / in code). Inline your code 
into two code units and you have unnecessarily duplicated initializer 
data.

Interestingly, I can't even get GCC to convert an initializer function into 
a symbol: https://godbolt.org/z/b61fcs
There's the same problem for inlining though: this will lead to lots of 
duplication bloat. So when using initializer functions, inlining should 
probably not be enforced and there needs to be a global function symbol as 
a fallback. OTOH we want the inliner to be able to actually inline 
initializer functions in any case...

> It's perfectly fine if code is suboptimal without -O.
> You can simply express more with a function than with a symbol (a symbol
> implies the function "memcpy(all)", whereas a function could do that and
> more).

That's why I'm not talking about only a symbol, I'm talking about the 
symbol backed by an initializer expression. The initializer expression 
(StructInitializer / ExpInitializer) is essentially the code 
representation of the initializer, as complex / compact as it may be. But 
the symbol fallback (SymbolExp?) can be useful in some cases.

> How would you express =void using a symbol in an object file?

Obviously there has to be some data there, 0, random, whatever. But again, 
I don't want to have the symbols, I only want to have them as a fallback 
when needed:

Maybe I don't really understand the problem: Consider this code:
https://explore.dgnu.org/z/_yixUX
----------
struct Large
{
    ubyte a = 42;
    size_t[64] blob = void;
    ubyte b = 10;
}

void foo()
{
    Large l;   
}
----------

Because of the byte-by-byte struct comparison, the blob memory actually 
has to be initialized to 0. Nevertheless, you can see that the backend 
does not reference the symbol at -O0 and it explicitly does:
mov     BYTE PTR [rbp-528], 42
So it does not only see "the symbol", it does see the individual field 
initializers. If byte-by-byte comparison wasn't a requirement, the 
backend (GCC) could perfectly well initialize only a and b.

Now move struct Large into a different file: You'll see that GCC now 
"only sees the symbol", so copies from "_D1s5Large6__initZ".

I see two problems with this:
* We do not get the symbol-less initializer form if using multiple files. 
  That's why I think the frontend should make the initializer expression 
  (StructInitializer), which provides expressions to initialize all 
  fields, available even for aggregates in non-root modules.
* We always emit the initializer symbol and pay for the overhead ==> 
  comdat.
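
For reference, the data behind a symbol like "_D1s5Large6__initZ" is what 
druntime exposes at run time through TypeInfo.initializer(). The following 
is only a minimal sketch of inspecting it for the Large struct above. The 
helper name initializerBytes is made up for illustration, and the sketch 
assumes a type whose initializer is not all zeros (for all-zero types 
druntime returns a slice with a null pointer, which must not be indexed).

```d
struct Large
{
    ubyte a = 42;
    size_t[64] blob = void;
    ubyte b = 10;
}

// Hypothetical helper (not part of druntime): returns the raw bytes of T's
// default initializer as published through its TypeInfo.
// Caveat: for types that are initialized to all zeros, the returned slice
// has a null pointer and only its length is meaningful.
const(ubyte)[] initializerBytes(T)()
{
    return cast(const(ubyte)[]) typeid(T).initializer();
}

unittest
{
    auto bytes = initializerBytes!Large();

    // The image covers the whole struct, including blob and any padding.
    assert(bytes.length == Large.sizeof);

    // The explicitly initialized fields show up at their field offsets.
    assert(bytes[Large.a.offsetof] == 42);
    assert(bytes[Large.b.offsetof] == 10);
}
```

This is the "dynamically accessible" use that currently forces the symbol 
to be emitted whether or not any code in the program ever reads it.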

Apart from that, there is also a GDC "bug" which seems to always emit the 
symbol-less initializer, if possible. It would be preferable to let the 
backend (GCC) choose which one to use and according to some tests in C++ 
experiments, that should be possible. But it probably needs -O to choose 
the best solution.

> 
>> If the initializer optimizations depend on -O flags, it should also be
>> possible to move the necessary steps in the backend into a different
>> step which is executed even without optimization flags. Choosing to
>> initialize using expressions vs. a symbol should not be an expensive
>> step.
> 
> Actually, this does sound like an expensive analysis to me (e.g.
> detecting the case of a large array with repetitive initialization
> inside a struct with a few other members). But maybe more practically,
> is it possible to enable/disable specific optimization passes for
> individual functions with gcc backend at -O0? (we can't with LLVM)

Of course it depends on how far you go. Simply checking how much actual 
initialization data there is vs. =void and alignment holes is simple.
Detecting foo [1, 2, 3, 1, 2, 3, 1, 2, 3] would be quite difficult. But 
how is that different when done in the frontend?

However, I'm not arguing at all that we should just pass a flat data 
buffer to the glue code and let the glue code figure out how to 
reconstruct initialization code from that. I'm suggesting that we always 
pass both, the comdat symbol and the initialization expression, to the 
backend:

For GCC, we can simply pass any expression (I'm not sure if it has to be 
constant, i.e. computable at compile time) in the GCC GENERIC backend 
language to DECL_INITIAL for a variable. So if the initializer in D was 
this:
-------
struct Foo
{
    int[64] data = repeat(1, 3, 64);
}
-------

in theory we should be able to just pass the initializer code in its 
GENERIC form to DECL_INITIAL. The GCC backend could then just generate 
the code for initialization.

So this then essentially is an initializer function, but of a more GCC 
readable kind. In some cases (Initialization of a global variable, maybe 
others) GCC would probably have to evaluate that code at compile time to 
obtain the data representation. That might be difficult, so maybe we have 
to consider this in the glue code and pass a complex expression/code 
based initializer in places where we can execute code but a data based 
initializer where that's not possible.
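
To illustrate the difference between the two forms at the D level, here is 
a rough sketch. The repeat helper is an assumption: it is not defined 
anywhere in this thread, so this version guesses that it cycles through the 
values lo..hi, and it takes the length as a template argument (repeat!64(1, 
3) instead of repeat(1, 3, 64)) so that it can return a static array. 
initFoo is likewise a made-up name.

```d
// Hypothetical stand-in for repeat(1, 3, 64) from the example above.
// Assumption: it fills the array by cycling through lo, lo+1, ..., hi.
int[N] repeat(size_t N)(int lo, int hi)
{
    int[N] result;
    foreach (i, ref e; result)
        e = lo + cast(int)(i % (hi - lo + 1));
    return result;
}

struct Foo
{
    // Data-based form: a field initializer must be known at compile time,
    // so the frontend evaluates the call via CTFE and keeps only the values.
    int[64] data = repeat!64(1, 3);
}

// Code-based form: the same values, but produced by running code at the
// point of initialization instead of copying a precomputed image.
void initFoo(ref Foo f)
{
    f.data = repeat!64(1, 3);
}

unittest
{
    Foo f = void;      // skip default initialization on purpose
    initFoo(f);

    // Both forms must agree on the resulting values.
    assert(f.data == Foo.init.data);
    assert(f.data[0 .. 6] == [1, 2, 3, 1, 2, 3]);
}
```

For a global variable only the data-based form is usable, whereas for a 
local variable either form would do and the backend could pick.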

Ideally, we pass both options to GCC and let GCC choose. The GCC backend 
code could be as simple as:

if (decl.initializer.isSymbol() &&
    decl.initializer.symbol.hasInitializerExpression())
{
    // TODO: When to use expr vs. symbol?
    initializer = decl.initializer.symbol.initializerExpression;
}

> 
>> I don't see how an initializer function would be more flexible than
>> that. In fact, you could generate the initializer function in the
>> backend if information about the initialization expression is always
>> preserved. Constructing an initializer function earlier (in the
>> frontend, or D user code) removes information about the target
>> architecture (-Os, memory available, efficient addressing of local
>> constant data, ...). Because of that, I think the backend is the best
>> place to implement this and the frontend should just provide the symbol
>> initializer expression.
> 
> I'm a little confused because your last sentence is exactly what we
> currently do, with the terminology:  frontend = dmd code that outputs a
> semantically analyzed AST. Backend = DMD/GCC/LLVM codegen. Possibly with
> "glue layer intermediate representation" in-between.

When I said backend there, I meant the GCC, architecture dependent 
backend, not the glue layer. 

> What I thought is discussed in this thread, is that we move the
> complexity out of the compilers (so out of current backends) into
> druntime. For that, I think an initializer function is a good solution
> (similar to emitting a constructor function, rather than implementing
> that codegen inside the backend).

But how is an initializer function different to the backend from a tree of 
StructInitializer / ExpInitializer? This is a 1:1 representation of the 
default initializer as written by the user. If you were to write an 
initializer function, wouldn't you just wrap that initializer tree in a 
statement and into a function?

But the backend would still have to do exactly the same code 
transformation, with the main difference that it now has to generate a 
function, inline the function and it has less information about the 
function (e.g. an initializer tree can be evaluated at compile time / 
const in GCC terms, a function may not necessarily be, side effects, ...).

So it seems to me, just passing the initializer tree from frontend to 
glue layer is the most information-preserving solution.
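
To make the comparison concrete for the Large struct from earlier, the two 
strategies could look roughly like this when written out by hand in D. 
This is only a sketch of what a backend might generate, not of how any 
compiler actually does it; the function names initFromSymbol and 
initFromExpressions are invented for illustration.

```d
import core.stdc.string : memcpy;

struct Large
{
    ubyte a = 42;
    size_t[64] blob = void;
    ubyte b = 10;
}

// Symbol form: "memcpy(all)" from the initializer image.
// Assumes the image is not all zeros; otherwise druntime hands back a
// null pointer and a memset would be needed instead.
void initFromSymbol(ref Large dst)
{
    const(void)[] image = typeid(Large).initializer();
    memcpy(&dst, image.ptr, image.length);
}

// Expression form: one store per explicitly initialized field.
// blob is "= void", so this version simply leaves it untouched.
void initFromExpressions(ref Large dst)
{
    dst.a = 42;
    dst.b = 10;
}

unittest
{
    Large x = void;
    initFromSymbol(x);
    assert(x.a == 42 && x.b == 10);

    Large y = void;
    initFromExpressions(y);
    assert(y.a == 42 && y.b == 10);
}
```

The second function is just the initializer tree wrapped in statements, 
which is why I don't see it carrying more information than the tree itself.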

Reflecting on this some more, I guess I finally understand your point 
about using a function. To summarize my points:
1 We do not get the expression initializer form if using multiple files. 
  That's why I think the frontend should make the initializer expression 
  (StructInitializer), which provides expressions to initialize all 
  fields, available even for aggregates in non-root modules.
2 We always emit the initializer symbol and pay for the overhead ==> 
  comdat
3 One thing I didn't consider so far: CTFE constant folding of
  expressions in the expression based initializer: I guess that can
  destroy interesting information for the glue layer. So here we really
  want two things: A code based initializer expression, which never does
  CTFE constant folding. And a folded / evaluated expression to initialize 
  global variables.

So I guess if we decide we never need the symbol and drop point 2, the 
third point, a "non-CTFEd initializer expression" is probably pretty 
close to what you wanted as an initializer function. I just didn't think 
of it as a function...

OTOH my point about using a symbol to unify initializer storage used in 
multiple invocations across code units would also apply to expression 
based initializers: Having a function there would actually allow saving 
space in some cases compared to always inlining the expression. So maybe 
a comdat, usually-inlined but optionally available function (e.g. for 
-Os) is a good idea...

I'm not sure if the GCC backend can handle an initializer function (with 
known body) as well as a DECL_INITIAL in non-optimizing cases though. 
Maybe this needs some backend engineering in GCC. 
(DECL_FUNC(DECL_INITIAL(x) = ...) ?
-- 
Johannes