Constructing a class in-place

Thu Jul 26 21:22:45 UTC 2018

On Thursday, 26 July 2018 at 12:45:52 UTC, Johan Engelen wrote:
> On Wednesday, 25 July 2018 at 08:11:59 UTC, rikki cattermole 
> wrote:
>>
>> Standard solution[0].
>>
>> [0] https://dlang.org/phobos/std_conv.html#.emplace.4
>
> Thanks for pointing to D's placement new. This is bad news for 
> my devirtualization work; before, I thought D is in a better 
> situation than C++, but now it seems we may be worse off.
>
> Before I continue the work, I'll have to look closer at this 
> (perhaps write an article about the situation in D, so more ppl 
> can help and see what is going on). In short:
> C++'s placement new can change the dynamic type of an object, 
> which is problematic for devirtualization. However, in C++ the 
> pointer passed to placement new may not be used afterwards 
> (it'd be UB). This means that the code `A* a = new A(); 
> a->foo(); a->foo();` is guaranteed to call the same function 
> `A::foo` twice, because if the first call to `foo` would do a 
> placement new on `a` (e.g. through `this`), the second call 
> would be UB.
> In D, we don't have placement new, great! And now, I learn that 
> the _standard library_ _does_ have something that looks like 
> placement new, but without extra guarantees of the spec that 
> C++ has.
> For some more info:
> https://stackoverflow.com/a/49569305
> https://stackoverflow.com/a/48164192
>
> - Johan

Please excuse me if my question is too naive, but how does this 
change anything?

The general pattern of using classes is:
1. Allocate memory. This can be either:
   1.a) implicit dynamic heap allocation done by the call to 
`GC.malloc` invoked via the implementation of the `new` operator 
for classes.

   1.b) explicit dynamic heap allocation via any allocator 
(`GC.malloc`, libc, std.experimental.allocator, etc.)
(1.b is also a special case for classes created via `new`: COM 
classes are allocated via malloc, see 
https://github.com/dlang/druntime/blob/cb5efa9854775c5a72acd6870083b16e5ebba369/src/rt/lifetime.d#L79)

   1.c) implicit stack allocation via `scope c = new Class();`
   1.d) implicit stack allocation via struct wrapper like `auto c 
= scoped!Class();`
   1.e) explicit stack allocation via 
`void[__traits(classInstanceSize, A)] buf = void;`
   1.f) explicit stack allocation via `void[] buf = 
alloca(__traits(classInstanceSize, A))[0 .. 
__traits(classInstanceSize, A)];`
   1.g) static allocation as a thread-local or global variable, or 
as part of one via an in-place buffer. To be honest, I'm not sure 
how compilers implement this today.

   1.h) Or any of the many variations of the above.

2. Explicit or implicit initialization of its vtable pointer, 
monitor (if the class is, or derives from, Object) and its 
fields: `buf[] = typeid(Class).initializer[];`

3. The class constructor is invoked, which in turn may require 
calls to one or more base class constructors.

...

4. The class is destroyed
   4.a) Implicitly via the GC
   4.b) Explicitly via `core.memory.__delete()`
   4.c) Explicitly via `destroy()`
   4.d) Explicitly via `std.experimental.allocator.dispose`, or 
any similar allocator wrapper.

5. The class instance memory may be freed.
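
To make steps 1 to 5 concrete, here is a minimal sketch that does 
each of them by hand, using variant 1.b (libc malloc) and variant 
4.c (`destroy`). The class `Greeter` and the function `roundTrip` 
are made-up names for illustration only; this is roughly what 
`std.conv.emplace` does on your behalf, not a copy of its 
implementation.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical class, used only for this illustration.
class Greeter
{
    int id;
    this(int id) { this.id = id; }
    int twice() { return id * 2; }
}

// Hypothetical helper that walks through steps 1 to 5 manually.
int roundTrip(int id)
{
    // 1. Allocate raw memory (variant 1.b, libc malloc).
    enum size = __traits(classInstanceSize, Greeter);
    void[] buf = malloc(size)[0 .. size];
    assert(buf.ptr !is null, "malloc failed");

    // 2. Copy the initializer image: vtable pointer, monitor slot
    //    and the default values of the fields.
    buf[] = typeid(Greeter).initializer[];

    // 3. Run the constructor on the pre-initialized bytes.
    auto g = cast(Greeter) buf.ptr;
    g.__ctor(id);
    int result = g.twice(); // ordinary virtual call

    // 4. End the object's lifetime (runs the destructor chain).
    destroy(g);

    // 5. Give the memory back. From here on the allocator may
    //    hand the same bytes to an object of any other type.
    free(buf.ptr);

    return result;
}

void main()
{
    assert(roundTrip(21) == 42);
}
```

Note that nothing in steps 2 to 4 depends on how the memory was 
obtained in step 1, so any of the variants 1.a to 1.h could be 
substituted.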

At the end of the day, the destructor is called and potentially 
the memory is freed (e.g. if it's dynamically allocated). Nothing 
stops the same bytes from being reused for another object of a 
different type.
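
As a small sketch of that last point, the following reuses one 
malloc'ed buffer for two unrelated classes via `std.conv.emplace`. 
`Cat`, `Dog` and `reuseStorage` are hypothetical names chosen for 
the example.

```d
import core.stdc.stdlib : malloc, free;
import std.algorithm.comparison : max;
import std.conv : emplace;

// Two unrelated, hypothetical classes.
class Cat { string speak() { return "meow"; } }
class Dog { string speak() { return "woof"; } }

// Hypothetical helper: constructs a Cat and then a Dog in the
// same bytes and returns what each of them said.
string reuseStorage()
{
    // One buffer that is large enough for either class.
    enum size = max(__traits(classInstanceSize, Cat),
                    __traits(classInstanceSize, Dog));
    void[] buf = malloc(size)[0 .. size];
    assert(buf.ptr !is null, "malloc failed");

    Cat cat = emplace!Cat(buf);
    string first = cat.speak();
    destroy(cat); // Cat's lifetime ends, the storage stays ours

    Dog dog = emplace!Dog(buf);
    string second = dog.speak();

    // Same address as before, but a different dynamic type.
    assert(cast(void*) dog is buf.ptr);

    destroy(dog);
    free(buf.ptr);
    return first ~ " " ~ second;
}

void main()
{
    assert(reuseStorage() == "meow woof");
}
```

After the second `emplace`, the stale `cat` reference still holds 
the same address, which now contains a `Dog`. That is the situation 
an optimizer has to rule out (or be told about) before it can 
assume that two virtual calls through one reference hit the same 
function.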

<slightly-off-topic>
C++ has two liberties that D neither has nor should need to 
have:
A. The C++ standard is very hand-wavy about the abstract machine 
on which C++ programs are semantically executing, giving special 
powers to its standard library to implement features that can't 
be expressed in standard C++.

B. Its primary target audience of expert-only programmers can 
tolerate the extremely dense minefield of undefined behavior that 
the standard committee doesn't shy away from putting behind each 
corner in the name of easier development of 'sufficiently smart 
compilers'. I'm talking about things like 
https://en.cppreference.com/w/cpp/utility/launder which most C 
programmers (curiously, 'C != C++') would consider truly bjorked.

</slightly-off-topic>

D, on the other hand, is (or at least I'm hopeful that it is) 
moving away from giving magical powers to its runtime or standard 
library and is embracing the spirit of bare-bones systems 
programming, where the programmer is allowed or even encouraged to 
implement everything from scratch (cf. -betterC) when that is 
the most sensible option.
While C and C++ approach portability by abstracting the machine, 
D approaches portability by laying all the cards on the table 
and defining things rather than leaving them unspecified, or at 
least by documenting the implementation-defined behavior.

What I'm trying to say is that 'new' is not as special in D as it 
is in C++ (ironically, as the 'new'-ed objects are GC-ed in D, 
and what could be more magical in a language spec than a GC) and 
given the ongoing @nogc long-term campaign its use is even 
becoming discouraged.
Given this trend, the abundance of templates, the increasing 
availability of LTO, and library-defined allocation and 
object/resource management schemes, I think it's more and more 
likely that compilers will be able to see the full picture of a 
class instance's lifetime. They should then either treat steps 1, 
2, 3, 4 and 5 with C semantics (don't make any assumptions), or 
try to detect instances of steps 4 and 5 and mark the end of the 
object's lifetime in the compiler, to allow aliasing of its 
storage as a potentially different type.