Feature: `static cast`
Quirin Schroll
qs.il.paperinik at gmail.com
Mon Jun 5 16:06:13 UTC 2023
On Friday, 2 June 2023 at 17:53:41 UTC, Paul Backus wrote:
> On Thursday, 1 June 2023 at 14:13:27 UTC, Quirin Schroll wrote:
>> There’s at least one case where I got bitten by 2. and it
>> cost me days to figure out that, contrary to the spec, casting
>> between `extern(C++)` class references isn’t checked at
>> runtime:
>> ```d
>> extern(C++) class C { void f() { } }
>> extern(C++) class D : C { }
>> void main() @safe
>> {
>>     assert(cast(D)(new C) is null); // fails
>> }
>> ```
>>
>> [...] [issue 21690 (Unable to dynamic cast `extern(C++)`
>> classes)](https://issues.dlang.org/show_bug.cgi?id=21690) is
>> solved.
>
> As you say, this is a bug. The cast in the example *should*
> work the same way as `dynamic_cast<D*>(new C())` in C++, but it
> doesn't, because the D compiler doesn't know how to access (or
> generate) the C++ RTTI.
>
> Until the bug is fixed, casts like these should probably be a
> compile-time error ("Error: dynamic casting of C++ classes is
> not implemented").
Agreed.
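For reference, here is a minimal C++ sketch of the checked behavior that the D cast is supposed to mirror. The names `Base`, `Derived` and `checked_downcast` are made up for illustration; they are not from the bug report.

```cpp
#include <cassert>

// Hypothetical types. They are polymorphic, so the C++ compiler emits RTTI.
struct Base { virtual void f() {} virtual ~Base() = default; };
struct Derived : Base {};

// Checked at run time: yields nullptr when *b is not actually a Derived.
Derived* checked_downcast(Base* b)
{
    return dynamic_cast<Derived*>(b);
}
```

Called on a plain `Base` object this returns `nullptr`; called on a `Derived` object it returns that object's address.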
> If you want the current behavior, which is a reinterpreting
> cast, the normal way to write that in D is with a pointer cast
> like `*cast(D*) &c` (which is, appropriately, `@system`).
I don’t think `*cast(D*) &c` is good. I’d use `cast(D)
cast(void*) c`, which makes it unambiguously clear that I want a
type paint.

Also, one does not want a `reinterpret_cast`; one wants a
`static_cast`. Only in specific cases does the `reinterpret_cast`
happen to be the same as the `static_cast`.

Up until now, because the up-cast is a `static_cast`, I thought D
did a `static_cast` for a down-cast as well, merely spelled like a
`dynamic_cast`. That would be wrong, but only slightly wrong.
That it in fact does a `reinterpret_cast` is terribly wrong.
> Of course, this would be a fairly disruptive breaking change,
> so it would require a deprecation period, but I think it would
> be worth it to disarm such an obvious footgun.
Yes. But it’s only surface-level. To reiterate: the fact that one
direction is a `static_cast` while the other is a
`reinterpret_cast` is a terrible footgun.

D doesn’t offer a `static_cast`, but it should, for two reasons:
* One might want a `static_cast` instead of a `dynamic_cast` for
performance reasons.
* As long as D has no C++ RTTI solution, the best it can offer
for a down-cast is a `static_cast`, and that one shouldn’t
syntactically look like a dynamic cast. In fact, if it were to
work with `extern(D)` classes as well, it cannot look like a
dynamic cast.

A `static_cast` is only equivalent to a `reinterpret_cast` under
single inheritance; to be precise, non-virtual single
inheritance. C++ and D both support multiple inheritance. D
supports it only for interfaces, but that still counts (example
below); also, interfaces are virtual inheritance.
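To make the single vs. multiple inheritance point concrete, here is a small C++ sketch that measures how far a `static_cast` moves the pointer. The types `A1`, `A2`, `Single`, `Multi` and the helper `adjustment` are hypothetical names of mine, and the concrete offsets are what common ABIs produce, not something the C++ standard guarantees.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical polymorphic bases and two derived types.
struct A1 { int a; virtual ~A1() = default; };
struct A2 { int b; virtual ~A2() = default; };
struct Single : A1 {};      // non-virtual single inheritance
struct Multi : A1, A2 {};   // multiple inheritance

// Byte distance between the whole object and its To base subobject,
// i.e. the adjustment that static_cast applies and reinterpret_cast skips.
template <class To, class From>
std::ptrdiff_t adjustment(From* p)
{
    auto whole = reinterpret_cast<std::uintptr_t>(p);
    auto sub = reinterpret_cast<std::uintptr_t>(static_cast<To*>(p));
    return static_cast<std::ptrdiff_t>(sub - whole);
}
```

On the usual implementations, `adjustment<A1>` is zero for both `Single` and `Multi`, so a type paint happens to work there, while `adjustment<A2>` on a `Multi` is non-zero, and that is exactly where a type paint goes wrong.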
For `extern(C++)` classes, there is no down-cast except a
`reinterpret_cast`, and a `reinterpret_cast` does not do the Right
Thing except when it does so by happenstance. That is the problem.

For `extern(D)` classes, the `dynamic_cast` does the Right Thing.
At worst, it’s occasionally more expensive than it needs to be,
but at least it’s there.
```d
// compile with -dip1000
extern(C++) interface I1 { int x() const scope @nogc @safe; }
extern(C++) interface I2 { int y() const scope @nogc @safe; }
extern(C++) class C : I1, I2
{
    private int _x, _y;
    this(int x, int y) @nogc @safe pure { _x = x, _y = y; }
    override int x() const scope @nogc @safe => _x;
    override int y() const scope @nogc @safe => _y;
}
void main() @nogc @safe
{
    import std.stdio;
    scope C c = new C(1, 2);
    I1 i1 = cast(I1) c;
    I2 i2 = cast(I2) c;
    assert(i1.x == 1);
    assert(i2.y == 2);
    () @trusted
    {
        C c1 = cast(C) i1;
        assert(c1.x == 1); // ok
        //assert(c1.y == 2); // fails??? (y is random garbage)
        C c2 = cast(C) i2;
        //assert(c2.x == 1); // fails (!)
        assert(c2.x == 2); // passes (!)
        //assert(c2.y == 2); // segfault (!)
    }();
}
```
* The first two asserts prove that the cast from `C` up to `I1`
and `I2` is a `static_cast`: variable `i2` refers to the `I2`
subobject of `c`, not the `I1` subobject as it would if it were a
`reinterpret_cast`. The `cast()` is in fact unnecessary.
* The `@trusted` asserts prove that the cast from `I1` and `I2`
down to `C` is a `reinterpret_cast`. It’s unsafe, of course, and
it does not do the Right Thing, as the second bunch of asserts
shows: `cast(C) i2` is an unadjusted pointer (it still points at
the `I2` subobject rather than at the start of the `C` object),
and accessing its `.y` component leads to a segmentation fault.
How do I get from `i2`, knowing it in fact refers to a `C`
object, to its `x` component? With a `static_cast`, which D
doesn’t have.
* `c`, `c1` and `c2` point to different addresses. Were `I1` not
an interface but a class and its `x()` an `abstract` function, at
least the random garbage wouldn’t occur. This is the special case
where the `reinterpret_cast` happens to be the same as a
`static_cast`.
This is how C++ does it:
```cpp
#include <iostream>
struct B1 { int x; virtual ~B1() = default; B1(int x) : x{x} {} };
struct B2 { int y; virtual ~B2() = default; B2(int y) : y{y} {} };
struct D : B1, B2 { D(int x, int y) : B1{x}, B2{y} { } };
int main()
{
    D d{ 1, 2 };
    B1* b1s = static_cast<B1*>(&d);
    B2* b2s = static_cast<B2*>(&d);
    B1* b1r = reinterpret_cast<B1*>(&d);
    B2* b2r = reinterpret_cast<B2*>(&d);
    std::cout << b1s->x << std::endl; // 1
    std::cout << b2s->y << std::endl; // 2
    std::cout << b1r->x << std::endl; // 1 (reinterpret_cast is a type paint)
    std::cout << b2r->y << std::endl; // 1 (reinterpret_cast is a type paint)
    D* d1s = static_cast<D*>(b1s);
    D* d2s = static_cast<D*>(b2s);
    std::cout << d1s->x << std::endl; // 1
    std::cout << d2s->y << std::endl; // 2
    //B2* b2b1s = static_cast<B2*>(b1s); // error: static_cast cannot cast sideways
    //B1* b1b2s = static_cast<B1*>(b2s); // error: (same)
    B1* b2b1d = dynamic_cast<B1*>(b2s); // dynamic_cast can cast sideways
    B2* b1b2d = dynamic_cast<B2*>(b1s); // (same)
    std::cout << b2b1d->x << std::endl; // 1
    std::cout << b1b2d->y << std::endl; // 2
    B1* b2b1r = reinterpret_cast<B1*>(b2s); // reinterpret_cast is a type paint
    B2* b1b2r = reinterpret_cast<B2*>(b1s); // (same)
    std::cout << b2b1r->x << std::endl; // 2 (!)
    std::cout << b1b2r->y << std::endl; // 1 (!)
}
```
D is supposed to do a `reinterpret_cast` when pointers (usually
`void*`) are involved. `reinterpret_cast` is inherently unsafe;
in fact, it’s so unsafe that the C++ standard bans it from
constant expressions (`constexpr`).
The `reinterpret_cast` between `B1` and `D` is fine because the
`B1` subobject has the same address as the `D` object.

The `reinterpret_cast` between `B2` and `D` or `B1` works here only
because `B2` has the exact same layout as `B1` and `D` starts
with the `B1` subobject. (Were `B2` to contain a `float` instead
of an `int`, it would be undefined behavior to access it.) As the
example demonstrates, a `reinterpret_cast` is a type paint.

A `static_cast` gives you a pointer to the correct subobject;
it’s not just a type paint, it does a pointer adjustment. For an
up-cast, this is 100% safe (whereas an up-cast via
`reinterpret_cast` generally is not). For a down-cast, it’s not
safe in general, but if the object actually is of the (derived)
type the cast targets, the cast has defined behavior. A
`static_cast` cannot do a sideways cast.
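The down-cast case, boiled down to a sketch: given a pointer to the second base that is known to refer to the derived object, a `static_cast` gets back to the first base's data. `L`, `R`, `Both`, `x_of` and `round_trip_ok` are hypothetical names for illustration only.

```cpp
// Hypothetical hierarchy with two polymorphic bases.
struct L { int x; virtual ~L() = default; };
struct R { int y; virtual ~R() = default; };
struct Both : L, R
{
    Both(int x_, int y_) { x = x_; y = y_; }
};

// Precondition (not checked!): r refers to the R subobject of a Both.
int x_of(R* r)
{
    return static_cast<Both*>(r)->x; // pointer is adjusted back to the Both
}

// Up to R and back down with static_cast lands on the original object.
bool round_trip_ok(Both* obj)
{
    R* r = obj; // implicit up-cast
    return static_cast<Both*>(r) == obj;
}
```

If the precondition of `x_of` does not hold, the behavior is undefined; that is the price for skipping the run-time check.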