How about some __initialize magic?
Stanislav Blinov
stanislav.blinov at gmail.com
Sat Nov 27 21:56:05 UTC 2021
D lacks syntax for initializing the uninitialized. We can do this:
```d
T stuff = T(args); // or new T(args);
```
but this?..
```d
T* ptr = allocateForT();
// now what?.. Can't just do *ptr = T(args) - that's an
// assignment, not initialization!
// is T a struct? A union? A class? An int?.. Is it even a
// constructor call?..
```
This is, uh, "solved", using library functions -
`emplaceInitializer`, `emplace`, `copyEmplace`, `moveEmplace`.
The fact that there are __four__ functions to do this should
already ring a bell, but if one were to look at how, e.g.,
`emplace` is implemented, there's lots and lots more to it -
classes or structs? Constructor or no constructor? Postblit?
Copy?.. And all the delegation... A single call to `emplace` may
copy the bits around more than once. Talk about initializing a
static array... Or look at `emplaceInitializer`, which the other
three all depend upon: it is, currently, built on a hack just to
avoid blowing up the stack (which is, ostensibly, what the previous,
less hacky hack led to). The upcoming `__traits(initSymbol)` would
help in removing the hack, but won't help CTFE any. At various
points of their lives, these things even explicitly called
`memcpy`, which is just... argh! And some still do
(`copyEmplace`, I'm looking at you). Call into the CRT to blit an
8-byte struct? With statically known size and alignment? Just to
sidestep the type system? Eh??? Much fun for copying arrays!
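To make the split concrete, here is a minimal sketch of what this looks like today with `core.lifetime`. The `Point` struct and the `buildPoints` helper are made up for illustration, and `malloc` merely stands in for `allocateForT`:

```d
import core.lifetime : emplace, copyEmplace, moveEmplace;
import core.stdc.stdlib : malloc, free;

// Hypothetical example type, not from the original post.
struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

// Hypothetical helper: fills raw memory via three different library calls.
Point[3] buildPoints()
{
    // Uninitialized storage; malloc stands in for allocateForT().
    auto raw = cast(Point*) malloc(3 * Point.sizeof);
    assert(raw !is null);
    scope (exit) free(raw);

    emplace(&raw[0], 1, 2);       // constructor call
    copyEmplace(raw[0], raw[1]);  // copy initialization
    moveEmplace(raw[1], raw[2]);  // move initialization

    // Copy the three elements out before the storage is freed.
    Point[3] result = raw[0 .. 3];
    return result;
}

void main()
{
    auto pts = buildPoints();
    assert(pts[0] == Point(1, 2));
    assert(pts[2] == Point(1, 2));
}
```

Three different entry points, with different argument orders, for what is conceptually one operation: initialize this memory.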
...And still, none of them would work in CTFE for many types, due
to various implementation quirks (which include those very calls
to memcpy, or reinterpret casts). This one could, potentially, be
solved with more barbed wire and swear words, that is, code,
but...
Thing is, all those functions are re-implementing what the
compiler can already do, but in a library. Or rather, come very
close to doing that, but still don't really get there. C++ with
its library solution does this better!
What if the language specified a "magic" function, called, say,
`__initialize`, that would just do the right thing (tm)? Given an
lvalue, it would instruct the compiler to generate code writing
the initializer, blitting, copying, or calling the appropriate
constructor with the arguments. And most importantly, would work
in CTFE regardless of type, and not require weird dances around
T.init, dummy types involving extra argument copies, or manual
fieldwise and elementwise blits (which is what one would have to
do in order to e.g. make `copyEmplace` CTFE-able).
I.e:
```d
// Write .init
T* raw0 = allocateForT();
// currently - emplaceInitializer(raw0);
(*raw0).__initialize;
// Initialize fields or call constructor, whichever is applicable
// for T(arg1, arg2)
T* raw1 = allocateForT();
// currently - raw1.emplace(forward!(arg1, arg2));
(*raw1).__initialize(forward!(arg1, arg2));
// Copy
T* raw2 = allocateForT();
// currently - copyEmplace(*raw1, *raw2);
(*raw2).__initialize(*raw1);
// Move
T* raw3 = allocateForT();
// currently - moveEmplace(*raw2, *raw3);
(*raw3).__initialize(move(*raw2));
// Could be called at runtime or during CTFE
auto createArray()
{
    // big array, don't initialize
    const(T)[1000] result = void;
    // exception handling omitted for brevity
    foreach (i, ref it; result)
    {
        // currently - `emplace`, which may fail to compile in CTFE
        it.__initialize(createIthElement(i));
    }
    return result;
}
// CTFE use case:
static auto array = createArray();
```
The wins are obvious - unified syntax, better error messages,
CTFE support, less library voodoo failing at mimicking the
compiler. The losses? I don't see any.
Note that I am not talking about yet another library function.
This would not be a symbol in druntime, this would be compiler
magic. Having that, `emplaceInitializer`, `emplace` and
`copyEmplace` could be re-implemented in terms of `__initialize`,
and eventually deprecated and removed. `moveEmplace` could linger
until DIP1040 is implemented, tried, and proven. The `move`
example, verbatim, would be pessimized compared to `moveEmplace`
due to moving twice, which hopefully DIP1040 could solve.
I'm a bit hesitant to suggest how this should interact with
`@safe`. On one hand, the established precedent is in `emplace` -
it infers, and I'm leaning towards that, even though it can
potentially invalidate existing state. On the other hand, because
it can indeed invalidate existing state, it should be `@system`.
But then it would require some additional facility just for
inference, so it could be called `@trusted` correctly, otherwise
it'd be useless. And that facility, whatever it is, better not be
another library reincarnation of all required semantics. For
example, something like a `__traits(isSafeToInitWith, T, args)`.
Whichever the approach, it should definitely infer all other
attributes.
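As an illustration of the "invalidate existing state" concern, here is a small sketch using today's `emplace`. The `Resource` type, its counters, and the `overwriteLive` helper are hypothetical and exist only to make the effect observable; the point is that initializing over a live object skips that object's destructor:

```d
import core.lifetime : emplace;

// Hypothetical type that counts constructor and destructor runs.
struct Resource
{
    static int constructed, destroyed;
    int id;
    this(int id) { this.id = id; ++constructed; }
    ~this() { if (id != 0) ++destroyed; } // ignore .init instances
}

// Hypothetical helper: returns [constructed, destroyed] after the scope ends.
int[2] overwriteLive()
{
    Resource.constructed = Resource.destroyed = 0;
    {
        auto r = Resource(1);
        // Initializes over a live object: Resource(1) is never destroyed.
        emplace(&r, 2);
    }
    return [Resource.constructed, Resource.destroyed];
}

void main()
{
    auto counts = overwriteLive();
    // More constructor runs than destructor runs: one object was lost.
    assert(counts[0] > counts[1]);
}
```

Whether a hypothetical `__initialize` should be allowed to do this from `@safe` code is exactly the open question.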
There are undoubtedly other things to consider. For example -
classes. On one hand, it would seem prudent for this hypothetical
`__initialize` to be calling class ctors. On the other, a
reference itself is just a POD, and generic code might indeed
want to write null as opposed to attempting to call a default
constructor. Then again, generic code still would have to
specialize for classes... Thoughts welcome.
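For reference, today's `emplace` already exhibits both behaviours depending on what it is handed. A minimal sketch follows; the `Node` class and the two helper functions are hypothetical, and GC memory is used for the buffer only to keep the alignment question out of the way:

```d
import core.lifetime : emplace;

// Hypothetical example class, not from the original post.
class Node
{
    int value;
    this(int value) { this.value = value; }
}

// Treat the reference as plain data: the slot ends up holding null.
Node initReferenceSlot()
{
    Node slot = void;
    emplace(&slot);
    return slot;
}

// Construct the object itself inside caller-provided storage.
Node constructInBuffer(int value)
{
    void[] storage = new ubyte[](__traits(classInstanceSize, Node));
    return emplace!Node(storage, value);
}

void main()
{
    assert(initReferenceSlot() is null);
    assert(constructInBuffer(42).value == 42);
}
```

A single `__initialize` would have to pick one of these meanings for `(*ptrToClassRef).__initialize(args)`, or offer a way to ask for either.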
What do you think? DIP this, yay or nay? Suggestions?..