Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Thu Mar 18 19:50:25 UTC 2021

On Wednesday, 17 March 2021 at 01:06:45 UTC, Timon Gehr wrote:
> [...]

Lots of good stuff in there that I didn't quote fully to not spam.

This kind of led me to look at things in a new way: The main 
problem here is that we have loosely defined requirements, and 
when this is the case, it is very easy to fool oneself and 
fulfill most requirements most of the time, but actually provide 
zero useful guarantees, be it to the dev or the optimizer.

So here, we have an existing system: ctor/dtor. The goal of this 
system is to ensure that an object is available to the program 
once it has been put in a proper state, and that for each object 
constructed, there will be a corresponding dtor call that will 
give an opportunity to undo this state.

The guarantee provided is that ctor/dtor go in pairs and the 
compiler ensures this. What has been constructed will be 
destroyed and vice versa.

This causes a problem: what to do when an object is duplicated? 
Then it is required to construct a new object, from the previous 
one, and that new object, like the previous one, will be 
destroyed.
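
To make the pairing concrete, here is a minimal C++ sketch. The 
`Tracked` type, the two counters and `live_after_scope` are made up 
for illustration; the only point is that a duplicate counts as a 
new construction and gets its own destruction.

```cpp
#include <cassert>

int constructed = 0;  // objects brought to life
int destroyed = 0;    // objects torn down

// Hypothetical type: every way of creating one bumps `constructed`.
struct Tracked {
    Tracked() { ++constructed; }
    Tracked(const Tracked&) { ++constructed; }  // a duplicate is a new object
    ~Tracked() { ++destroyed; }
};

// Returns the number of objects still alive once the scope is exited.
inline int live_after_scope() {
    {
        Tracked original;
        Tracked duplicate = original;  // second object, second dtor call
        assert(constructed - destroyed == 2);
    }
    return constructed - destroyed;
}
```

Calling `live_after_scope()` once leaves both counters at 2 and 
nothing alive: the pairing holds for the original and for the copy.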

Because construction and destruction might be expensive, we want 
to be able to group a copy and a destruction operation of an 
object (provided there is no further use between that copy and 
that destruction) into one: a move operation.

Which leads us to a primitive set of requirements:
1/ Construction and destruction of objects map 1:1.
2/ An object cannot be used prior to its construction or after its 
destruction.
3/ As an optimization, we want to be able to remove 
copy/destruction pairs when the object is not used after the copy.

1/ and 2/ are already provided, but might be inefficient. 3/ is a 
way to make things more efficient, and in more ways than you'd 
think.

One notorious difference between C++ and D is that in C++, 
objects must have a fixed address, while in D, they do not. This 
means that the D compiler is free to move objects around. This 
has more consequences than one would expect; consider for 
instance the following sample code: https://godbolt.org/z/9hWzxb

It is clear from the disassembly that *2* dereferences are 
happening, when the C++ code only has one. How come?

Because structs in C++ are not movable by default, and the struct 
preexists the function call, it must be living somewhere in the 
caller's stack, and is passed by reference at the ABI level. To 
make it look like it is passed by value, the caller will make a 
copy and then destroy it after the call.
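
As far as I understand the common C++ ABIs (Itanium in 
particular), this hidden reference kicks in for class types with a 
non-trivial copy constructor, move constructor or destructor, 
while fully trivial types can still travel in registers. A rough 
sketch of that distinction using standard type traits follows. 
`Plain`, `Owning` and `trivially_passable` are hypothetical names, 
and the trait check is only an approximation of the real ABI rule, 
not a statement of it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with no user-provided copy or destruction logic.
struct Plain {
    int* ptr;
};

// Hypothetical type with a user-provided copy constructor and destructor.
struct Owning {
    int* ptr;
    Owning() : ptr(new int(0)) {}
    Owning(const Owning& other) : ptr(new int(*other.ptr)) {}
    Owning& operator=(const Owning&) = delete;
    ~Owning() { delete ptr; }
};

// Approximation (an assumption, not the ABI text): a type whose copy and
// destruction are both trivial can be handed over as plain bits.
template <typename T>
constexpr bool trivially_passable() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

static_assert(trivially_passable<Plain>(), "plain data can go by value");
static_assert(!trivially_passable<Owning>(), "ctor/dtor forces indirection");
```

Both structs hold a single pointer, so the difference in how they 
are passed comes purely from the presence of the ctor/dtor.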

In practice, this is worse than it looks in this simplified 
example. Not only does this mean that a vast number of 
dereferences are executed across the program, but this is only 
the 1st order effect. Because numerous things now go through 
pointers, it forces the optimizer to prove a ton of things using 
alias analysis that would be self evident without that extra 
indirection, and in many cases, it cannot. For instance:

void foo(unique<int> a, unique<int> b) {
   a = make_unique<int>(...);
   // The compiler has to assume that b may have been modified here,
   // because both a and b could be the same object, and therefore b
   // could be modified by this new assignment.
}
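
The aliasing effect itself can be observed with plain ints. This 
is a minimal sketch with made-up function names: it does not show 
the generated code, only why the compiler cannot assume the two 
parameters are distinct once they are reached through references.

```cpp
#include <cassert>

// Parameters reached through references may designate the same object,
// so the write to `b` can change what `a` reads back.
inline int through_refs(int& a, int& b) {
    a = 1;
    b = 2;              // overwrites `a` too if both refer to one object
    return a * 10 + b;  // `a` has to be reloaded after the write to `b`
}

// Parameters held by value are distinct objects by construction.
inline int through_values(int a, int b) {
    a = 1;
    b = 2;              // cannot touch `a`
    return a * 10 + b;  // always 12, the compiler can fold it
}

// Passes the same object twice, which is perfectly legal.
inline int aliased_result() {
    int x = 0;
    return through_refs(x, x);
}

// Passes two distinct objects.
inline int distinct_result() {
    int x = 0;
    int y = 0;
    return through_refs(x, y);
}
```

With distinct arguments both versions return 12, but 
`aliased_result()` returns 22, so the by-reference version cannot 
be simplified the way the by-value one can.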

This is a major problem of the C++ object model. This is not a 
problem D has at the moment. This is simply the wrong default. 
This leaves us with one more requirement:

4/ Objects must be usable by value at the ABI level, including 
objects with ctor/dtor.

The obvious problem with this is interior pointers, and I'll 
come back to them later on. It must however be understood that 
the vast majority of struct usages do not involve interior 
pointers, and therefore throwing away 4/ for interior pointer 
support seems to be self defeating.

Another peculiarity of the C++ object model can be seen in this 
sample: https://godbolt.org/z/KGeffj

In the generated code, we can see that the smart pointer is 
allocated and initialized, then passed to the function by 
reference, which we expect. But what's interesting comes after 
the function call: there is a test and a branch. The generated 
code tests the value of the smart pointer, and only frees the 
memory if the pointer isn't null. For readability, I made the 
function nothrow, but remove it and you'll see that a vast 
amount of code has to be generated for exception handling too.

This is happening because the function could have moved the 
object away. While the sensible way to handle an object that has 
been moved is to not use it at all, this was not possible for C++ 
for backward compatibility reasons. As a result, a moved object 
is put in a "null" state, where the destruction is a noop. This 
null state is the source of a ton of extra work by destructors and 
a ton of generated code for nothing.
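
Here is a minimal sketch of that null state, written out by hand. 
`Handle`, `consume` and the counters are hypothetical; the shape 
follows the usual C++ smart pointer pattern rather than any 
specific library implementation.

```cpp
#include <cassert>
#include <utility>

int dtor_calls = 0;  // how many destructors ran
int releases = 0;    // how many of them had a resource to free

// Hypothetical owning handle.
struct Handle {
    int* res;

    explicit Handle(int v) : res(new int(v)) {}
    Handle(Handle&& other) noexcept : res(other.res) {
        other.res = nullptr;      // leave the source in its "null" state
    }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    ~Handle() {
        ++dtor_calls;
        if (res != nullptr) {     // the test and branch every dtor pays for
            ++releases;
            delete res;
        }
    }
};

// Takes ownership of the handle and reads the value it owns.
inline int consume(Handle h) { return *h.res; }

inline int move_then_destroy() {
    Handle a(42);
    int v = consume(std::move(a));  // `a` is now null but still gets destroyed
    assert(a.res == nullptr);
    return v;
}
```

One resource, one release, but two destructor calls: the second 
one runs only to discover there is nothing to do.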

If the function moves the object, then we want no destruction at 
all, and if it doesn't we want to destroy without checking 
against a null state. Obviously, we want the callee to do that as 
the caller doesn't have the information, but that turns out to be a 
complex task when the object is passed by ref to the callee due 
to the previous problem.

This leaves us with one more requirement:
5/ Objects must not require a null state.

It is to be noted that the current DIP proposes to add 3/ to D, 
but at the cost of either 1/ or 5/ being broken, which IMO is 
self defeating. Let's see why.
In the currently proposed scheme, the move constructor or move 
assignment has access to the object being moved from. One 
of the following MUST happen:

a/ The previous object is left as is after the move. This means 
that 1/ is not ensured by default by the constructor anymore as 
any leftover won't be destroyed. It is easy to say that devs will 
be careful but it is pretty much bound to fail, because it does 
the wrong thing by default - a change in the object might break 1/ 
silently - and it can do so non locally - a change to a member of 
a member of a member can break 1/ silently.

b/ The previous object is destroyed, but in this case, we ought 
to place it in a null state as we move. This is the approach C++ 
is taking and it breaks 5/ .

This problem is fundamentally unavoidable, because we have 2 
objects when semantically we should only have one. We have to 
either do something with the leftovers (which breaks 5/) or ignore 
the leftovers (which breaks 1/). There are no other options than 
do something or do nothing once that path is taken.

So, what do we want to do with move constructors anyways? Can't 
we just move the struct field by field recursively and be done 
with it? Yes, and I'd argue there is a problem if this isn't 
enough for 95% of the cases. Which leads to the two use cases I 
was able to identify:
  - Non movable struct. It is important that such a struct doesn't 
move. For instance, when the struct is some sort of header of a 
larger data segment. Another example is a struct that represents 
some kind of guard that needs to see its construction/destruction 
done in order. This can be achieved by disabling the move 
constructor, whatever the move constructor is defined as. It is 
fairly easy to realize such a use case, the move constructor 
simply needs to exist at all.
  - Movable struct that requires some form of bookkeeping. For 
these cases, a postblit would work with one exception: interior 
pointers.
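
For the first use case, this is roughly what "disabling move" 
looks like in today's C++. It is only an analogy for what D could 
offer, and `ScopeGuard` and `depth` are made-up names.

```cpp
#include <cassert>
#include <type_traits>

int depth = 0;  // nesting level maintained by live guards

// Hypothetical guard: construction and destruction must happen in order
// and in place, so both copying and moving are disabled.
struct ScopeGuard {
    ScopeGuard() { ++depth; }
    ~ScopeGuard() { --depth; }

    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard(ScopeGuard&&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
    ScopeGuard& operator=(ScopeGuard&&) = delete;
};

static_assert(!std::is_move_constructible<ScopeGuard>::value,
              "a guard must never be relocated");

// Returns the nesting level observed while two guards are alive.
inline int nested_depth() {
    ScopeGuard outer;
    ScopeGuard inner;  // constructed after `outer`, destroyed before it
    return depth;
}
```

Any attempt to pass or return such a guard by value is rejected at 
compile time, which is exactly the guarantee this use case needs.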

What I refer to as interior pointers are structs containing 
pointers to elements which are within the struct itself. While 
this idiom exists, it is vanishingly rare and becoming rarer over 
time. The main reason for this is that memory has become slower, 
computation faster, and pointers larger, which in turn led people 
to use "relative pointers", namely pointers defined as an offset 
from this. Unless it is expected that the struct may be more than 
4GB in size - which is almost never the case - it's all good. The 
extra addition required is well worth the memory saved (and the 
increased hit rate in the cache that results from it). See 
https://www.youtube.com/watch?v=G3bpj-4tWVU for instance on how 
the swift runtime started using such techniques.
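
A minimal sketch of such a relative pointer, assuming a 32-bit 
offset measured from the start of the struct. `Record` and its 
members are hypothetical, and this is not how the swift runtime 
does it, just the general idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record that refers to a byte of its own storage. Instead of
// a raw `char*`, it keeps a 32-bit offset relative to `this`.
struct Record {
    char data[16];
    std::int32_t cursor_off;

    void point_at(std::size_t index) {
        cursor_off = static_cast<std::int32_t>(offsetof(Record, data) + index);
    }

    char* cursor() {
        // The extra addition: rebuild the address from wherever we live now.
        return reinterpret_cast<char*>(this) + cursor_off;
    }
};

// Relocates a record bit for bit and reports whether the cursor followed.
inline bool cursor_survives_relocation() {
    Record a;
    std::memset(a.data, 0, sizeof a.data);
    a.data[5] = 'x';
    a.point_at(5);

    Record b;
    std::memcpy(&b, &a, sizeof(Record));  // plain bitwise move, no fixup
    a.data[5] = '!';                       // prove `b` does not look at `a`

    return b.cursor() == &b.data[5] && *b.cursor() == 'x';
}
```

The struct can be moved field by field with no hook at all, and 
the offset is half the size of a 64-bit pointer.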

I'll be blunt, once these techniques are known, I've actually 
never encountered a case of interior pointers that would not be 
solved by disabling move altogether. I'm not pretending it 
doesn't exist, but I've never seen it. It simply doesn't make 
sense to sacrifice any of the above mentioned requirements for it, 
even if it turns out this is really needed, because, well, this is 
the edge case of the edge case, and while enabling it might be an 
option, throwing away things which are good in the general case 
for it just doesn't make sense.

I suspect that even then, making the struct unmovable and then 
defining a custom method to move it manually would do the trick 
just fine. But just in case, here is what I propose: simply add 
an intrinsic, such as `void* __pre_move_address()` that can be 
called in the postblit, returning the address of the premove 
object. Any object using it would, of course, discard 4/ and not 
be usable as a value and instead always be passed by reference at 
the ABI level. This is the least constraining requirement to 
break, because it impacts performance exclusively and never 
correctness like 1/ or 5/ would. However, considering it is 
possible to do it manually once you disable move, I strongly 
suspect the bang is not worth the effort.
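
To give an idea of what the hook would do, here is a C++ 
simulation. Nothing here exists in D or in any compiler: 
`post_move` stands in for the postblit, its `pre_move` parameter 
stands in for the value `__pre_move_address()` would return, and 
`relocate` stands in for the compiler's bitwise move.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical struct with a genuine interior pointer.
struct Cursor {
    char storage[8];
    char* pos;  // points somewhere inside `storage`

    Cursor() : pos(storage) { std::memset(storage, 0, sizeof storage); }

    // Stand-in for a postblit that can query the pre-move address.
    void post_move(const void* pre_move) {
        const Cursor* old = static_cast<const Cursor*>(pre_move);
        std::ptrdiff_t offset = pos - old->storage;  // where we pointed before
        pos = storage + offset;                      // same spot, new home
    }
};

// Stand-in for the compiler: copy the bytes, then run the hook.
inline void relocate(Cursor* from, Cursor* to) {
    std::memcpy(static_cast<void*>(to), from, sizeof(Cursor));
    to->post_move(from);
}

// Moves a cursor and reports whether the interior pointer was fixed up.
inline bool interior_pointer_fixed_up() {
    Cursor a;
    a.pos = a.storage + 3;

    Cursor b;
    relocate(&a, &b);
    return b.pos == b.storage + 3;
}
```

Note that `relocate` takes both objects by address, which is the 
whole point: a type using the hook gives up 4/ and nothing else.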