emplace, scope, enforce [Was: Re: Manual...]

Rory Mcguire rjmcguire at gm_no_ail.com
Wed Jul 21 01:08:22 PDT 2010


Rory McGuire wrote:

> On Wed, 21 Jul 2010 03:58:33 +0200, bearophile
> <bearophileHUGS at lycos.com> wrote:
> 
>> Andrei Alexandrescu:
>>
>>> emplace(), defined in std.conv, is relatively new. I haven't yet added
>>> emplace() for class objects, and this is as good an opportunity as any:
>>> http://www.dsource.org/projects/phobos/changeset/1752
>>
>> Thank you, I have used this, and later I did a few tests too.
>>
>> The "scope" for class instantiations can be deprecated once there is an
>> acceptable alternative. You can't deprecate features before you have
>> found a good enough alternative.
>>
>> ---------------------
>>
>> A first problem is the syntax: to allocate an object on the stack you
>> need something like:
>>
>> // is testbuf correctly aligned?
>> ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>> Test t = emplace!(Test)(cast(void[])testbuf, arg1, arg2);
>>
>>
>> That is much uglier, hairier and more error-prone than:
>> scope Test t = new Test(arg1, arg2);
>>
>>
>> I have tried to build a helper to improve the situation, something
>> that looks like:
>> Test t = StackAlloc!(Test, arg1, arg2);
>>
>> But failing that, my second try was this, not good enough:
>> mixin(stackAlloc!(Test, Test)("t", "arg1, arg2"));
>>
>> ---------------------
>>
>> A second problem is that this program compiles with no errors:
>>
>> import std.conv: emplace;
>>
>> final class Test {
>>     int x, y;
>>     this(int xx, int yy) {
>>         this.x = xx;
>>         this.y = yy;
>>     }
>> }
>>
>> Test foo(int x, int y) {
>>     ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>>     Test t = emplace!(Test)(cast(void[])testbuf, x, y);
>>     return t;
>> }
>>
>> void main() {
>>     foo(1, 2);
>> }
>>
>>
>>
>> While the following one gives:
>> test.d(13): Error: escaping reference to scope local t
>>
>>
>> import std.conv: emplace;
>>
>> final class Test {
>>     int x, y;
>>     this(int xx, int yy) {
>>         this.x = xx;
>>         this.y = yy;
>>     }
>> }
>>
>> Test foo(int x, int y) {
>>     scope t = new Test(x, y);
>>     return t;
>> }
>>
>> void main() {
>>     foo(1, 2);
>> }
>>
>>
>> So the compiler is aware that the scoped object can't escape, while
>> with emplace things become more bug-prone. "scope" can cause other
>> bugs (some time ago I filed a bug report about one such problem), but it
>> avoids the most common bug. (I am not sure that emplace solves that
>> problem with scope; I think it shares the same problem, plus adds new
>> ones).
>>
>> ---------------------
>>
>> A third problem is that the destructor doesn't get called:
>>
>>
>> import std.conv: emplace;
>> import std.c.stdio: puts;
>>
>> final class Test {
>>     this() {
>>     }
>>     ~this() { puts("killed"); }
>> }
>>
>> void main() {
>>     ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>>     Test t = emplace!(Test)(cast(void[])testbuf);
>> }
>>
>>
>> That prints nothing. With scope the destructor gets called (even if it's
>> not present!).
>>
>> ---------------------
>>
>> This is not a problem of emplace(), it's a problem of the dmd optimizer.
>> I have also done a few performance tests, using this basic pseudocode:
>>
>> while (i < Max)
>> {
>>    create testObject(i, i, i, i, i, i)
>>    testObject.doSomething(i, i, i, i, i, i)
>>    testObject.doSomething(i, i, i, i, i, i)
>>    testObject.doSomething(i, i, i, i, i, i)
>>    testObject.doSomething(i, i, i, i, i, i)
>>    destroy testObject
>>    i++
>> }
>>
>>
>> Coming from here:
>> http://www.drdobbs.com/java/184401976
>> And its old timings:
>> http://www.ddj.com/java/184401976?pgno=9
>>
>>
>> The Java version of the code is simple:
>>
>> final class Obj {
>>     int i1, i2, i3, i4, i5, i6;
>>
>>     Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>>         this.i1 = ii1;
>>         this.i2 = ii2;
>>         this.i3 = ii3;
>>         this.i4 = ii4;
>>         this.i5 = ii5;
>>         this.i6 = ii6;
>>     }
>>
>>     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5,
>>                      int ii6) {
>>     }
>> }
>>
>> class Test {
>>     public static void main(String args[]) {
>>         final int N = 100_000_000;
>>         int i = 0;
>>         while (i < N) {
>>             Obj testObject = new Obj(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             // testObject = null; // makes no difference
>>             i++;
>>         }
>>     }
>> }
>>
>>
>>
>> This is a D version that uses emplace() (if you don't use emplace here
>> the performance of the D code is very bad compared to the Java one):
>>
>> // program #1
>> import std.conv: emplace;
>>
>> final class Test { // 32 bytes each instance
>>     int i1, i2, i3, i4, i5, i6;
>>     this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>>         this.i1 = ii1;
>>         this.i2 = ii2;
>>         this.i3 = ii3;
>>         this.i4 = ii4;
>>         this.i5 = ii5;
>>         this.i6 = ii6;
>>     }
>>     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5,
>>                      int ii6) {
>>     }
>> }
>>
>> void main() {
>>     enum int N = 100_000_000;
>>
>>     int i;
>>     while (i < N) {
>>         ubyte[__traits(classInstanceSize, Test)] buf = void;
>>         Test testObject = emplace!(Test)(cast(void[])buf,
>>                                          i, i, i, i, i, i);
>>         // Test testObject = new Test(i, i, i, i, i, i);
>>         // scope Test testObject = new Test(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject = null;
>>         i++;
>>     }
>> }
>>
>>
>> The Java code (server) runs in about 0.25 seconds here.
>> The D code (that doesn't do heap allocations at all) runs in about 3.60
>> seconds.
>>
>> With a bit of experimentation I have seen that emplace() doesn't get
>> inlined, and the cause is that it contains enforce(). enforce contains a
>> throw, and it seems dmd doesn't inline functions that can throw; you can
>> test it with a little test program like this:
>>
>>
>> import std.c.stdlib: atoi;
>> void foo(int b) {
>>     if (b)
>>         throw new Throwable(null);
>> }
>> void main() {
>>     int b = atoi("0");
>>     foo(b);
>> }
>>
>>
>> So if you comment out the two enforce() inside emplace() dmd inlines
>> emplace() and the running time becomes about 2.30 seconds, less than ten
>> times slower than Java.
>>
>> If emplace() doesn't contain calls to enforce() then the loop in main()
>> becomes (dmd 2.047, optimized build):
>>
>>
>> L1A:		push	dword ptr 02Ch[ESP]
>> mov	EDX,_D10test6_good4Test7__ClassZ[0Ch]
>> mov	EAX,_D10test6_good4Test7__ClassZ[08h]
>> push	EDX
>> push	ESI
>> call	near ptr _memcpy
>> mov	ECX,03Ch[ESP]
>> mov	8[ECX],EBX
>> mov	0Ch[ECX],EBX
>> mov	010h[ECX],EBX
>> mov	014h[ECX],EBX
>> mov	018h[ECX],EBX
>> mov	01Ch[ECX],EBX
>> inc	EBX
>> add	ESP,0Ch
>> cmp	EBX,05F5E100h
>> jb	L1A
>>
>>
>> (The memcpy is done by emplace to initialize the object before calling
>> its ctor. You must perform the initialization because it needs the
>> pointer to the virtual table and monitor. The monitor here was null. I
>> think a future LDC2 can optimize away more stuff in that loop, so it's
>> not so bad).
>>
>>
>> If you use this in program #1:
>> scope Test testObject = new Test(i, i, i, i, i, i);
>> It runs in about 6 seconds (also because the ctor is called even if it's
>> missing).
>>
>> If in program #1 you use just new, without scope, the runtime is about
>> 27.2 seconds, about 110 times slower than Java.
>>
>> Bye,
>> bearophile
> 
> Takes 18m27.720s in PHP :)

Takes 5m26.776s in Python.
Takes 0m1.008s in Java.

Can't test the D version: I don't have emplace and dsource is ignoring me.
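
For reference, here is a rough sketch of the two stack-allocation routes discussed in the quoted message, written against a current Phobos rather than the changeset linked above. It assumes std.typecons.scoped (a later Phobos addition that plays the role of the StackAlloc helper bearophile asks for) and the built-in destroy() for running a destructor by hand. The names destroyed, sumScoped and sumEmplaced are made up for illustration, and the size_t buffer is only one way to sidestep the "is testbuf correctly aligned?" question for a class whose fields need no more than pointer alignment.

```d
import std.conv : emplace;
import std.typecons : scoped;

int destroyed; // hypothetical counter: number of destructor runs so far

final class Test {
    int x, y;
    this(int xx, int yy) { x = xx; y = yy; }
    ~this() { ++destroyed; }
}

// Route 1: scoped!T stores the instance inside the caller's stack frame
// and runs the destructor when the wrapper goes out of scope.
int sumScoped(int a, int b) {
    auto t = scoped!Test(a, b);
    return t.x + t.y;
}

// Route 2: emplace into a hand-made buffer. Nothing runs the destructor
// automatically here, so it is called explicitly via destroy().
int sumEmplaced(int a, int b) {
    enum size = __traits(classInstanceSize, Test);
    // size_t elements keep the buffer pointer-aligned (assumed sufficient
    // for this class, which holds only ints plus the hidden header).
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] store = void;
    void[] buf = store[];
    Test t = emplace!Test(buf[0 .. size], a, b);
    scope (exit) destroy(t); // runs ~this() after the return value is computed
    return t.x + t.y;
}

void main() {
    assert(sumScoped(1, 2) == 3);
    assert(destroyed == 1);
    assert(sumEmplaced(3, 4) == 7);
    assert(destroyed == 2);
}
```

Neither route gives back the compiler's "escaping reference" check from the scope example: returning t from sumEmplaced would still compile and leave a dangling reference.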

