Feature: `static cast`

Quirin Schroll qs.il.paperinik at gmail.com
Mon Jun 5 16:06:13 UTC 2023


On Friday, 2 June 2023 at 17:53:41 UTC, Paul Backus wrote:
> On Thursday, 1 June 2023 at 14:13:27 UTC, Quirin Schroll wrote:
>> There’s at least one case where I got bitten by 2. and it 
>> cost me days to figure out that, contrary to the spec, casting 
>> between `extern(C++)` class references isn’t checked at 
>> runtime:
>> ```d
>> extern(C++) class C { void f() { } }
>> extern(C++) class D : C { }
>> void main() @safe
>> {
>>     assert(cast(D)(new C) is null); // fails
>> }
>> ```
>>
>> [...] [issue 21690 (Unable to dynamic cast `extern(C++)` 
>> classes)](https://issues.dlang.org/show_bug.cgi?id=21690) is 
>> solved.
>
> As you say, this is a bug. The cast in the example *should* 
> work the same way as `dynamic_cast<D*>(new C())` in C++, but it 
> doesn't, because the D compiler doesn't know how to access (or 
> generate) the C++ RTTI.
>
> Until the bug is fixed, casts like these should probably be a 
> compile-time error ("Error: dynamic casting of C++ classes is 
> not implemented").

Agreed.

> If you want the current behavior, which is a reinterpreting 
> cast, the normal way to write that in D is with a pointer cast 
> like `*cast(D*) &c` (which is, appropriately, `@system`).

I don’t think `*cast(D*) &c` is good. I’d use
`cast(D) cast(void*) c`, which makes it unambiguously clear that I
want a type paint. Also, one does not want a `reinterpret_cast`;
one wants a `static_cast`. Only in specific cases does the
`reinterpret_cast` happen to be the same as the `static_cast`.

Up until now, because the up-cast is a `static_cast`, I thought D
did a `static_cast` for a down-cast as well, only with syntax that
looks like a `dynamic_cast`. That would be wrong, but only slightly
wrong. That it in fact does a `reinterpret_cast` is terribly wrong.

> Of course, this would be a fairly disruptive breaking change, 
> so it would require a deprecation period, but I think it would 
> be worth it to disarm such an obvious footgun.

Yes. But it’s only surface-level. To reiterate, the fact that one 
direction is a `static_cast` while the other is a 
`reinterpret_cast` is a terrible footgun.

D doesn’t offer a `static_cast`, but it should, for two reasons:
* One might want a `static_cast` instead of a `dynamic_cast` for 
performance reasons.
* As long as D has no C++ RTTI solution, the best it can offer
for a down-cast is a `static_cast`, and that one shouldn’t
syntactically look like a dynamic cast. In fact, if it were to
work with `extern(D)` classes, it cannot look like a dynamic
cast.

A `static_cast` is only equivalent to a `reinterpret_cast` under
single inheritance; to be precise, non-virtual single inheritance.
C++ and D both support multiple inheritance. D only supports it
for interfaces, but that still counts (example below); also,
interfaces are virtual inheritance.
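
The single- versus multiple-inheritance distinction can be checked in C++ by comparing addresses. This is only a minimal sketch: the types `Base`, `Single`, `Left`, `Right`, `Multi` and the helper `same_address` are made up for illustration, and the layout it observes is what the common ABIs produce, not something the C++ standard guarantees.

```cpp
struct Base   { int b; };
struct Single : Base { int s; };          // non-virtual single inheritance

struct Left   { int l; };
struct Right  { int r; };
struct Multi  : Left, Right { int m; };   // multiple inheritance

// Hypothetical helper: compares two object addresses regardless of type.
inline bool same_address(const void* a, const void* b) { return a == b; }

// Single inheritance: the up-cast leaves the address unchanged on common
// ABIs, so a reinterpret_cast would happen to give the same pointer.
inline bool single_upcast_keeps_address()
{
    Single s{};
    return same_address(static_cast<Base*>(&s), &s);
}

// Multiple inheritance: the up-cast to the second base moves the pointer
// to the Right subobject, so a type paint would point at the wrong place.
inline bool multi_upcast_moves_address()
{
    Multi m{};
    return !same_address(static_cast<Right*>(&m), &m);
}
```

Both functions are expected to return `true` with the usual compilers; only in the `Single` case could a type paint stand in for the `static_cast`.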

For `extern(C++)` classes there is no down-cast except a 
`reinterpret_cast` and a `reinterpret_cast` does not do the Right 
Thing except when it does by happenstance. That is the problem.

For `extern(D)` classes, the `dynamic_cast` does the Right Thing. 
Occasionally, it’s only more expensive than it needs to be, but 
at least it’s there.
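
To illustrate the performance argument, here is a C++ sketch of a down-cast that is a plain `static_cast` in release builds and is only verified by `dynamic_cast` while assertions are enabled. The helper name `checked_static_cast` and the types `Shape` and `Circle` are hypothetical; the idea is similar in spirit to Boost’s `polymorphic_downcast`.

```cpp
#include <cassert>

struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };

// Hypothetical helper: unchecked static_cast down-cast, verified through
// dynamic_cast only when assertions are enabled (NDEBUG not defined).
template <class Derived, class Base>
Derived* checked_static_cast(Base* p)
{
    assert(p == nullptr || dynamic_cast<Derived*>(p) != nullptr);
    return static_cast<Derived*>(p);
}

// The caller promises that s really refers to a Circle.
inline double radius_of(Shape* s)
{
    return checked_static_cast<Circle>(s)->radius;
}
```

With `NDEBUG` defined, the assertion disappears and only the pointer adjustment of the `static_cast` remains.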

```d
// compile with -dip1000
extern(C++) interface I1 { int x() const scope @nogc @safe; }
extern(C++) interface I2 { int y() const scope @nogc @safe; }
extern(C++) class C : I1, I2
{
     private int _x, _y;
     this(int x, int y) @nogc @safe pure { _x = x, _y = y; }
     override int x() const scope @nogc @safe => _x;
     override int y() const scope @nogc @safe => _y;
}

void main() @nogc @safe
{
     import std.stdio;

     scope C c = new C(1, 2);
     I1 i1 = cast(I1) c;
     I2 i2 = cast(I2) c;
     assert(i1.x == 1);
     assert(i2.y == 2);
     () @trusted
     {
         C c1 = cast(C) i1;
         assert(c1.x == 1); // ok
         //assert(c1.y == 2); // fails??? (y is random garbage)

         C c2 = cast(C) i2;
         //assert(c2.x == 1); // fails (!)
         assert(c2.x == 2); // passes (!)
         //assert(c2.y == 2); // segfault (!)
     }();
}
```

* The first two asserts prove that the cast from `C` up to `I1` 
and `I2` is a `static_cast`. Variable `i2` refers to the `I2` 
subobject of `c`, not the `I1` subobject as it would if it were a 
`reinterpret_cast`. The `cast()` is in fact unnecessary.
* The `@trusted` asserts prove that the cast from `I1` and `I2`
down to `C` is a `reinterpret_cast`. It’s unsafe, of course, and
it does not do the Right Thing, as the second bunch of asserts
shows: `cast(C) i2` is a misaligned pointer, and accessing its
`.y` component leads to a segmentation fault. How do I get from
`i2`, knowing it’s in fact a `C` object, to its `x` component?
With a `static_cast`, which D doesn’t have.
* `c`, `c1` and `c2` point to different addresses. Were `I1` not 
an interface but a class and its `x()` an `abstract` function, at 
least the random garbage wouldn’t occur. This is the special case 
where the `reinterpret_cast` happens to be the same as a 
`static_cast`.
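
For comparison, a rough C++ counterpart of the D example shows how a `static_cast` answers that question. It is a sketch under one assumption that differs from D: the interfaces are modeled as ordinary (non-virtual) abstract base classes. The function name `x_through_i2` is made up for illustration.

```cpp
struct I1 { virtual int x() const = 0; virtual ~I1() = default; };
struct I2 { virtual int y() const = 0; virtual ~I2() = default; };

struct C : I1, I2
{
    C(int x, int y) : x_(x), y_(y) {}
    int x() const override { return x_; }
    int y() const override { return y_; }
private:
    int x_, y_;
};

// Gets from an I2 that is known to be part of a C to that C's x().
// Undefined behavior if i2 does not actually refer to a C.
inline int x_through_i2(I2* i2)
{
    C* c = static_cast<C*>(i2); // adjusts the pointer back to the C object
    return c->x();
}
```

Unlike the D cast above, the pointer returned by the `static_cast` is the address of the original `C` object, so both `x()` and `y()` work.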

This is how C++ does it:
```cpp
#include <iostream>

struct B1 { int x; virtual ~B1() = default; B1(int x) : x{x} {} };
struct B2 { int y; virtual ~B2() = default; B2(int y) : y{y} {} };
struct D : B1, B2 { D(int x, int y) : B1{x}, B2{y} { } };

int main()
{
     D d{ 1, 2 };
     B1* b1s = static_cast<B1*>(&d);
     B2* b2s = static_cast<B2*>(&d);
     B1* b1r = reinterpret_cast<B1*>(&d);
     B2* b2r = reinterpret_cast<B2*>(&d);
     std::cout << b1s->x << std::endl; // 1
     std::cout << b2s->y << std::endl; // 2
     std::cout << b1r->x << std::endl; // 1 (reinterpret_cast is a type paint)
     std::cout << b2r->y << std::endl; // 1 (reinterpret_cast is a type paint)

     D* d1s = static_cast<D*>(b1s);
     D* d2s = static_cast<D*>(b2s);
     std::cout << d1s->x << std::endl; // 1
     std::cout << d2s->y << std::endl; // 2

     //B2* b2b1s = static_cast<B2*>(b1s); // error: static_cast cannot cast sideways
     //B1* b1b2s = static_cast<B1*>(b2s); // error: (same)

     B1* b2b1d = dynamic_cast<B1*>(b2s); // dynamic_cast can cast sideways
     B2* b1b2d = dynamic_cast<B2*>(b1s); // (same)
     std::cout << b2b1d->x << std::endl; // 1
     std::cout << b1b2d->y << std::endl; // 2

     B1* b2b1r = reinterpret_cast<B1*>(b2s); // reinterpret_cast is a type paint
     B2* b1b2r = reinterpret_cast<B2*>(b1s); // (same)
     std::cout << b2b1r->x << std::endl; // 2 (!)
     std::cout << b1b2r->y << std::endl; // 1 (!)
}
```

D is supposed to do a `reinterpret_cast` when pointers (usually
`void*`) are involved. A `reinterpret_cast` is inherently unsafe;
in fact, it’s so unsafe that the C++ standard bans it from
constant expressions (`constexpr` evaluation).

The `reinterpret_cast` between `B1` and `D` is fine because the 
`B1` subobject has the same address as the `D` object.

The `reinterpret_cast` between `B2` and `D` or `B1` is fine only 
because `B2` has the exact same layout as `B1` and `D` starts 
with the `B1` subobject. (Were `B2` to contain a `float` instead 
of an `int`, it would be undefined behavior to access it.) As the 
example demonstrates, a `reinterpret_cast` is a type paint.

A `static_cast` gives you a pointer to the correct subobject;
it’s not just a type paint, it does a pointer adjustment. For an
up-cast, this is 100% safe (whereas an up-cast via
`reinterpret_cast` generally is not). For a down-cast, it’s not
safe in general, but if the object actually is of the (derived)
type the cast targets, the cast has defined behavior. A
`static_cast` cannot do a sideways cast.
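
Although a single `static_cast` cannot go sideways, two of them can when the most derived type is known: down to `D`, then up to the other base. A minimal sketch using the same `B1`, `B2` and `D` as in the C++ example; the function name `sideways` is made up for illustration.

```cpp
struct B1 { int x; virtual ~B1() = default; B1(int x) : x{x} {} };
struct B2 { int y; virtual ~B2() = default; B2(int y) : y{y} {} };
struct D : B1, B2 { D(int x, int y) : B1{x}, B2{y} { } };

// Sideways from B1 to B2 without RTTI: down to D, then up to B2.
// Only valid if b1 really is the B1 subobject of a D object.
inline B2* sideways(B1* b1)
{
    D* d = static_cast<D*>(b1);   // down-cast: unchecked pointer adjustment
    return static_cast<B2*>(d);   // up-cast: always safe
}
```

For an object that really is a `D`, the result is the same pointer that `dynamic_cast<B2*>(b1)` would return, just without the runtime check.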

