How to get to a class initializer through introspection?

Wed Aug 5 13:40:16 UTC 2020

On Tue, 04 Aug 2020 10:13:53 +0000, Johannes Pfau wrote:

> On Tue, 04 Aug 2020 09:31:16 +0000, Johan wrote:
> 
>> On Tuesday, 4 August 2020 at 02:03:34 UTC, Andrei Alexandrescu wrote:
>>> On 8/3/20 10:44 AM, Johan wrote:
>>>> On Monday, 3 August 2020 at 13:01:55 UTC, Andrei Alexandrescu wrote:
>>>>> Would it be effective to iterate through the .tupleof and initialize
>>>>> each in turn?
>>>> 
>>>> Possibly. IIRC, the spec obliges us to initialize the padding
>>>> in between address-aligned members as well, such that a memcmp works
>>>> to compare structs. If that is true, then we have to initialize the
>>>> padding as well and a memcpy would be that much nicer.
>>>
>>> To play devil's advocate, the padding bytes should not have been
>>> changed by user code in the first place :o).
>> 
>> But the memory into which objects are placed will be tainted and thus
>> the padding areas will not be the same for each object. (it's the same
>> for =void members. All can be incorporated into the initializer
>> function, but it's work.)
>> 
>> -Johan
> 
> I wonder whether an initial memset + then initializing members may be a
> good solution? The compiler backends may be clever enough to optimize
> the memset (e.g. if there are no gaps, so it's completely redundant, if
> there is a single gap and explicitly filling that gap is more efficient
> than zeroing everything, ...).
> 
> However, in some cases a memcpy which copies both member initialization
> data and padding may be better? I'm not sure how to decide when which
> option is better or whether we can somehow have both...

A quick look at some generated ASM for C++ code suggests that GCC can 
"see through" memcpys if the copied data is "well known":

https://godbolt.org/z/jno9KM *

So if GCC actually knows which data will be copied, it may rewrite the 
memcpy into assignments of statically known values. It may also split the 
memcpy into multiple assignments that skip holes, or remove redundant 
writes (e.g. if a member is written again immediately after 
initialization), ...
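
As a minimal sketch of the kind of code this is about (this is not the code 
behind the godbolt link; Sample, sample_init, init_by_memcpy and same_bytes 
are made-up names for illustration):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical aggregate; on common ABIs there is a padding hole
// between `tag` and `value`.
struct Sample {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Stand-in for a compiler-generated init symbol whose contents are
// visible in this translation unit.
constexpr Sample sample_init{7, 42};

// Initialize possibly tainted storage by copying the whole init image,
// padding included. Because sample_init is a known constant, an
// optimizer is free to turn this call into plain stores.
void init_by_memcpy(Sample& dst) {
    std::memcpy(&dst, &sample_init, sizeof(Sample));
}

// Byte-wise comparison of two objects, padding included.
bool same_bytes(const Sample& a, const Sample& b) {
    return std::memcmp(&a, &b, sizeof(Sample)) == 0;
}
```

Two objects that were filled with different garbage and then passed through 
init_by_memcpy should compare equal with same_bytes, since both receive the 
same image. Whether the backend keeps a real memcpy call or emits direct 
stores depends on the compiler and optimization level.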

I'd therefore suggest the following:
1) Make all init symbols COMDAT: This ensures that if a symbol is 
actually needed (address taken, real memcpy call) it will be available. 
But if it is not needed, the compiler does not have to output the symbol. 
If it's required in multiple files, COMDAT will merge the symbols into 
one.

2) Ensure the compiler always knows the data of that symbol. This 
probably means during codegen, the initializer should never be an 
external symbol. It needs to be a COMDAT symbol with attached initializer 
expression. And the initializer data must always be fully available 
in .di files.

The two rules combined should allow the backend to choose the 
initialization method that is most appropriate for the target 
architecture.
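
For comparison, the closest C++ analogue I can think of is a C++17 inline 
constexpr variable. This is only a sketch under that assumption: Header, 
header_init, make_header and header_init_address are hypothetical names, 
and how the symbol is emitted (typically as a COMDAT or weak definition) 
is toolchain-specific.

```cpp
#include <cstring>

// Hypothetical aggregate with mixed member sizes.
struct Header {
    int  version;
    char kind;
    long length;
};

// Would normally live in a header included by several translation
// units. `inline` lets every unit carry the definition together with
// its initializer; toolchains typically merge the copies and only emit
// the symbol if something actually needs its address.
inline constexpr Header header_init{1, 'h', 0};

// Value use: the initializer data is visible, so the backend may choose
// between a real memcpy and direct stores.
Header make_header() {
    Header h;
    std::memcpy(&h, &header_init, sizeof(Header));
    return h;
}

// Address use: this is the case where the symbol has to exist.
const Header* header_init_address() {
    return &header_init;
}
```

Both rules are covered here: the symbol is available when its address is 
taken, and the data is known at every use site.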

To summarize, implementing "initializer functions" may prevent this 
optimization to some degree (though that depends on inlining and other 
factors). So I'd probably prefer to keep compiler-generated initializer 
symbols for aggregates, but make sure that these symbols always have an 
initializer expression attached, so the backend can choose which one to 
use.

In addition, there needs to be some well-defined way for user code to 
initialize variables and trigger these optimizations. Most likely 
__builtin_memset(p, 0, size) and __builtin_memcpy(p, &T.init, T.sizeof) 
would be fine though.
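
A rough sketch of what such user code could look like, written in C++ 
against the GCC/Clang builtins (Point, point_init, zero_init and copy_init 
are hypothetical names; point_init stands in for T.init):

```cpp
#include <type_traits>

// Hypothetical aggregate and its stand-in for T.init.
struct Point {
    int x;
    int y;
};
constexpr Point point_init{3, 4};

// Zero-fill the storage of *p; corresponds to __builtin_memset(p, 0, size).
template <typename T>
void zero_init(T* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte initialization needs a trivially copyable type");
    __builtin_memset(p, 0, sizeof(T));
}

// Copy a known init image over *p; corresponds to
// __builtin_memcpy(p, &T.init, T.sizeof).
template <typename T>
void copy_init(T* p, const T* init) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte initialization needs a trivially copyable type");
    __builtin_memcpy(p, init, sizeof(T));
}
```

Called as zero_init(&p) or copy_init(&p, &point_init), both go through 
builtins the backend recognizes, so it can apply the rewrites described 
above when the source data is known.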

* It is interesting that the most efficient way to return a 
default-initialized aggregate by value on x86 is to just return the 
address of the initializer. I guess the ABI copies anyway in the caller...

-- 
Johannes