emplace, scope, enforce [Was: Re: Manual...]
Rory McGuire
rmcguire at neonova.co.za
Wed Jul 21 00:23:08 PDT 2010
On Wed, 21 Jul 2010 03:58:33 +0200, bearophile <bearophileHUGS at lycos.com>
wrote:
> Andrei Alexandrescu:
>
>> emplace(), defined in std.conv, is relatively new. I haven't yet added
>> emplace() for class objects, and this is as good an opportunity as any:
>> http://www.dsource.org/projects/phobos/changeset/1752
>
> Thank you, I have used this, and later I ran a few tests too.
>
> The "scope" for class instantiations can be deprecated once there is an
> acceptable alternative. You can't deprecate features before you have
> found a good enough alternative.
>
> ---------------------
>
> A first problem is the syntax: to allocate an object on the stack you
> need something like:
>
> // is testbuf correctly aligned?
> ubyte[__traits(classInstanceSize, Test)] testbuf = void;
> Test t = emplace!(Test)(cast(void[])testbuf, arg1, arg2);
>
>
> That looks much worse, and is hairier and more error-prone, than:
> scope Test t = new Test(arg1, arg2);
>
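On the "is testbuf correctly aligned?" comment in the emplace snippet above: a ubyte array carries no alignment guarantee beyond one byte. A minimal sketch of one way around it, assuming pointer alignment is enough for a class like Test whose fields are plain ints (isAligned is a hypothetical helper, not part of Phobos):

```d
import std.conv : emplace;

final class Test {
    int x, y;
    this(int xx, int yy) { x = xx; y = yy; }
}

// Hypothetical helper: true if p is a multiple of alignment.
bool isAligned(const(void)* p, size_t alignment) {
    return (cast(size_t) p) % alignment == 0;
}

void main() {
    enum size = __traits(classInstanceSize, Test);
    // Backing the buffer with size_t words makes it pointer-aligned.
    // Assumption: that is sufficient for Test (vtable pointer, monitor, ints).
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] words = void;
    void[] buf = (cast(void*) words.ptr)[0 .. size];
    assert(isAligned(buf.ptr, size_t.sizeof));

    // Construct the object inside the stack buffer.
    Test t = emplace!Test(buf, 1, 2);
    assert(t.x == 1 && t.y == 2);
}
```

A class with stricter alignment needs (for example one holding SIMD fields) would need a differently aligned buffer, so treat this as an illustration only.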
>
> I have tried to build a helper to improve the situation, something
> that looks like:
> Test t = StackAlloc!(Test, arg1, arg2);
>
> Failing that, my second try was this, which is not good enough:
> mixin(stackAlloc!(Test, Test)("t", "arg1, arg2"));
>
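For comparison with the helper idea above: later Phobos releases (after this thread) ship std.typecons.scoped, which is close to the StackAlloc!(Test, arg1, arg2) shape. A small sketch, where sumOnStack is a hypothetical function used only to show the call:

```d
import std.typecons : scoped;

final class Test {
    int x, y;
    this(int xx, int yy) { x = xx; y = yy; }
}

// Hypothetical function, only here to show usage.
int sumOnStack(int a, int b) {
    // Keep the result as `auto`: the wrapper itself owns the storage,
    // so hold on to the wrapper rather than a bare Test reference.
    auto t = scoped!Test(a, b);
    return t.x + t.y;   // members are reachable through the wrapper
}

void main() {
    assert(sumOnStack(1, 2) == 3);
}
```

The destructor of the wrapped object runs when t leaves scope, which is the behaviour "scope" gave for class instantiations.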
> ---------------------
>
> A second problem is that this program compiles with no errors:
>
> import std.conv: emplace;
>
> final class Test {
>     int x, y;
>     this(int xx, int yy) {
>         this.x = xx;
>         this.y = yy;
>     }
> }
>
> Test foo(int x, int y) {
>     ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>     Test t = emplace!(Test)(cast(void[])testbuf, x, y);
>     return t;
> }
>
> void main() {
>     foo(1, 2);
> }
>
>
>
> While the following one gives:
> test.d(13): Error: escaping reference to scope local t
>
>
> import std.conv: emplace;
>
> final class Test {
>     int x, y;
>     this(int xx, int yy) {
>         this.x = xx;
>         this.y = yy;
>     }
> }
>
> Test foo(int x, int y) {
>     scope t = new Test(x, y);
>     return t;
> }
>
> void main() {
>     foo(1, 2);
> }
>
>
> So the compiler is aware that the scoped object can't escape, while
> with emplace things become more bug-prone. "scope" can cause other
> bugs (some time ago I filed a bug report about one such problem), but it
> avoids the most common bug. (I am not sure that emplace solves that
> problem with scope; I think it shares the same problem, plus adds new
> ones).
>
> ---------------------
>
> A third problem is that the destructor doesn't get called:
>
>
> import std.conv: emplace;
> import std.c.stdio: puts;
>
> final class Test {
>     this() {
>     }
>     ~this() { puts("killed"); }
> }
>
> void main() {
>     ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>     Test t = emplace!(Test)(cast(void[])testbuf);
> }
>
>
> That prints nothing. Using scope, the destructor gets called (even if
> it's not present!).
>
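On the missing destructor call above: emplace only constructs the object, so tearing it down is left to the caller. A hedged sketch of doing that by hand with destroy() from current druntime (the destroyed counter and useAndDestroy are hypothetical names added for illustration; the size_t-backed buffer assumes pointer alignment is enough for this class):

```d
import std.conv : emplace;
import core.stdc.stdio : puts;

final class Test {
    static int destroyed;   // hypothetical counter, for illustration only
    this() {}
    ~this() { destroyed++; puts("killed"); }
}

// Hypothetical helper: build a Test in a stack buffer, then tear it down.
void useAndDestroy() {
    enum size = __traits(classInstanceSize, Test);
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] words = void;
    Test t = emplace!Test((cast(void*) words.ptr)[0 .. size]);
    // Nothing else will run ~this for this object, so call it explicitly.
    destroy(t);
}

void main() {
    useAndDestroy();
    assert(Test.destroyed == 1);
}
```

Wrapping the destroy call in scope(exit) would also cover early returns and exceptions.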
> ---------------------
>
> This is not a problem of emplace(), it's a problem of the dmd optimizer.
> I have done a few performance tests too. I have used this basic
> pseudocode:
>
> while (i < Max)
> {
>     create testObject(i, i, i, i, i, i)
>     testObject.doSomething(i, i, i, i, i, i)
>     testObject.doSomething(i, i, i, i, i, i)
>     testObject.doSomething(i, i, i, i, i, i)
>     testObject.doSomething(i, i, i, i, i, i)
>     destroy testObject
>     i++
> }
>
>
> Coming from here:
> http://www.drdobbs.com/java/184401976
> And its old timings:
> http://www.ddj.com/java/184401976?pgno=9
>
>
> The Java version of the code is simple:
>
> final class Obj {
>     int i1, i2, i3, i4, i5, i6;
>
>     Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>         this.i1 = ii1;
>         this.i2 = ii2;
>         this.i3 = ii3;
>         this.i4 = ii4;
>         this.i5 = ii5;
>         this.i6 = ii6;
>     }
>
>     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>     }
> }
>
> class Test {
>     public static void main(String args[]) {
>         final int N = 100_000_000;
>         int i = 0;
>         while (i < N) {
>             Obj testObject = new Obj(i, i, i, i, i, i);
>             testObject.doSomething(i, i, i, i, i, i);
>             testObject.doSomething(i, i, i, i, i, i);
>             testObject.doSomething(i, i, i, i, i, i);
>             testObject.doSomething(i, i, i, i, i, i);
>             // testObject = null; // makes no difference
>             i++;
>         }
>     }
> }
>
>
>
> This is a D version that uses emplace() (if you don't use emplace here
> the performance of the D code is very bad compared to the Java one):
>
> // program #1
> import std.conv: emplace;
>
> final class Test { // 32 bytes each instance
>     int i1, i2, i3, i4, i5, i6;
>     this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>         this.i1 = ii1;
>         this.i2 = ii2;
>         this.i3 = ii3;
>         this.i4 = ii4;
>         this.i5 = ii5;
>         this.i6 = ii6;
>     }
>     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>     }
> }
>
> void main() {
>     enum int N = 100_000_000;
>
>     int i;
>     while (i < N) {
>         ubyte[__traits(classInstanceSize, Test)] buf = void;
>         Test testObject = emplace!(Test)(cast(void[])buf, i, i, i, i, i, i);
>         // Test testObject = new Test(i, i, i, i, i, i);
>         // scope Test testObject = new Test(i, i, i, i, i, i);
>         testObject.doSomething(i, i, i, i, i, i);
>         testObject.doSomething(i, i, i, i, i, i);
>         testObject.doSomething(i, i, i, i, i, i);
>         testObject.doSomething(i, i, i, i, i, i);
>         testObject = null;
>         i++;
>     }
> }
>
>
> The Java code (server) runs in about 0.25 seconds here.
> The D code (which doesn't do heap allocations at all) runs in about 3.60
> seconds.
>
> With a bit of experimentation I have seen that emplace() doesn't get
> inlined, and the cause is that it contains enforce(). enforce contains a
> throw, and it seems dmd doesn't inline functions that can throw. You can
> test it with a little test program like this:
>
>
> import std.c.stdlib: atoi;
> void foo(int b) {
>     if (b)
>         throw new Throwable(null);
> }
> void main() {
>     int b = atoi("0");
>     foo(b);
> }
>
>
> So if you comment out the two enforce() calls inside emplace(), dmd
> inlines emplace() and the running time becomes about 2.30 seconds, less
> than ten times slower than Java.
>
> If emplace() doesn't contain calls to enforce() then the loop in main()
> becomes (dmd 2.047, optimized build):
>
>
> L1A:    push dword ptr 02Ch[ESP]
>         mov EDX,_D10test6_good4Test7__ClassZ[0Ch]
>         mov EAX,_D10test6_good4Test7__ClassZ[08h]
>         push EDX
>         push ESI
>         call near ptr _memcpy
>         mov ECX,03Ch[ESP]
>         mov 8[ECX],EBX
>         mov 0Ch[ECX],EBX
>         mov 010h[ECX],EBX
>         mov 014h[ECX],EBX
>         mov 018h[ECX],EBX
>         mov 01Ch[ECX],EBX
>         inc EBX
>         add ESP,0Ch
>         cmp EBX,05F5E100h
>         jb L1A
>
>
> (The memcpy is done by emplace to initialize the object before calling
> its ctor. You must perform the initialization because it needs the
> pointer to the virtual table and monitor. The monitor here was null. I
> think a future LDC2 can optimize away more stuff in that loop, so it's
> not so bad).
>
>
> If you use this in program #1:
> scope Test testObject = new Test(i, i, i, i, i, i);
> It runs in about 6 seconds (also because the destructor is called even
> if it's missing).
>
> If in program #1 you use just new, without scope, the runtime is about
> 27.2 seconds, about 110 times slower than Java.
>
> Bye,
> bearophile
Takes 18m27.720s in PHP :)