[Issue 14912] New: Move initialisation of GC'd struct and class data from the callee to the caller
via Digitalmars-d-bugs
digitalmars-d-bugs at puremagic.com
Wed Aug 12 14:53:35 PDT 2015
https://issues.dlang.org/show_bug.cgi?id=14912
Issue ID: 14912
Summary: Move initialisation of GC'd struct and class data from
the callee to the caller
Product: D
Version: D2
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P1
Component: dmd
Assignee: nobody at puremagic.com
Reporter: ibuclaw at gdcproject.org
Currently, druntime initialises all GC'd data in the callee.
Examples:
_d_newclass():
    p[0 .. ci.init.length] = ci.init[];
_d_newitemT():
    memset(p, 0, _ti.tsize);
_d_newitemiT():
    memcpy(p, init.ptr, init.length);
Each of these examples results in an opaque library call. And because the
implementation is always hidden away in druntime, the optimizer (or an
optimizing backend) cannot assume anything about the contents of the memory
behind the pointer returned by these calls.
For instance, in a very simple case:
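As a rough illustration of the current arrangement, here is a minimal sketch in
D of an allocation routine that initialises in the callee. The name
allocAndInit is hypothetical and this is not druntime's actual code; it also
assumes a recent druntime, where the class image that the report calls ci.init
is exposed as TypeInfo_Class.initializer.

```d
import core.memory : GC;
import core.stdc.string : memcpy;

class A
{
    int foo() { return 42; }
}

// Hypothetical stand-in for a runtime hook that initialises in the callee.
Object allocAndInit(const TypeInfo_Class ci)
{
    // Default image of the class: __vptr, __monitor and field defaults.
    const(void)[] image = ci.initializer();
    void* p = GC.malloc(image.length);
    // The initialisation happens here, out of sight of the calling code.
    memcpy(p, image.ptr, image.length);
    return cast(Object) p;
}

void main()
{
    // The caller only sees a pointer coming back from a call.
    auto a = cast(A) allocAndInit(typeid(A));
    assert(a !is null);
    assert(a.foo() == 42);
}
```

When such a routine lives in a separately compiled runtime, the caller has no
visibility into the memcpy, which is the situation described above.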
class A
{
    int foo () { return 42; }
}
int test()
{
    A a = new A(), b = a;
    return b.foo();
}
If the contents of 'a' were instead set by the caller, in code emitted by the
compiler, we would have the following codegen (pseudo-code):
int test()
{
    struct A *a;
    struct A *b;
    a = new A();
    *a = A.init;
    b = a;
    return b.__vptr.foo(b);
}
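The pseudo-code above can be approximated in ordinary D. This is only a sketch
under stated assumptions: GC.malloc stands in for a bare allocation that does
no initialisation, and TypeInfo_Class.initializer (the name in recent druntime
for what the report calls ci.init) supplies the default image. It is not what
dmd emits today.

```d
import core.memory : GC;

class A
{
    int foo() { return 42; }
}

int test()
{
    // Step 1: raw memory only, a stand-in for an allocation call that
    // no longer initialises anything.
    const(void)[] image = typeid(A).initializer();
    void* p = GC.malloc(image.length);

    // Step 2: the caller writes the default image, i.e. "*a = A.init".
    p[0 .. image.length] = image[];

    A a = cast(A) p;   // reinterpret the raw memory as a class reference
    A b = a;
    return b.foo();
}

void main()
{
    assert(test() == 42);
}
```

Because the store of the default image now sits in test() itself, a compiler
looking at test() can see which vtable pointer the object holds.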
From that pseudo-code, an optimizer can break down and inline the default
initializer without the need for memset/memcpy:
// ...
a = new A();
a.__vptr = typeid(A).vtbl.ptr;
a.__monitor = null;
// ...
Perform copy propagation to replace all occurrences of b with a:
// ...
return *(a.__vptr + 40)(a);
// ...
Global value numbering to resolve the lookup in the vtable, and de-virtualize
the call:
// ...
return A.foo(a);
// ...
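To make the difference between the virtual and the de-virtualized call
concrete, here is a small D sketch. The derived class B is hypothetical and
exists only to show why the optimizer has to know which vtable was stored
before it can replace the lookup with a direct call. The obj.A.foo() form is
D's syntax for a non-virtual call to A's implementation.

```d
class A
{
    int foo() { return 42; }
}

// Hypothetical subclass, used only to contrast the two kinds of call.
class B : A
{
    override int foo() { return 7; }
}

void main()
{
    A a = new A();
    assert(a.foo() == 42);    // virtual dispatch through __vptr
    assert(a.A.foo() == 42);  // direct, non-virtual call to A.foo

    A x = new B();
    assert(x.foo() == 7);     // dynamic type is B, so dispatch reaches B.foo
    assert(x.A.foo() == 42);  // direct call bypasses the vtable
}
```

For 'a' the two forms agree, which is what allows the rewrite to A.foo(a);
for 'x' they differ, so the rewrite is only valid when the stored __vptr is
known at the call site.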
After some dead code removal, the inliner now sees the direct call and is ready
to inline A.foo:
int test()
{
    struct A *a = new A();
    a.__vptr = typeid(A).vtbl.ptr;
    a.__monitor = null;
    return 42;
}
There is another challenge here, namely removing the now-dead GC allocation,
but that will have to wait for another bug report. I think that this simple
change is justified by the opportunity to produce much better code when
classes are used in at least simple ways, and I haven't even considered what
becomes possible with LTO.
If there are no objections, I suggest that we make a push for this. It
will require dmd to update its own NewExp::toElem, and the memcpy/memset
parts to be removed from druntime.