Help with Template Code
Max Samukha
samukha at voliacable.com
Sun Apr 1 15:21:19 PDT 2007
On Sun, 01 Apr 2007 22:19:03 +0200, Frits van Bommel
<fvbommel at REMwOVExCAPSs.nl> wrote:
>Jarrett Billingsley wrote:
>> "Max Samukha" <samukha at voliacable.com> wrote in message
>> news:nmmv03h5g5mtbkn6hetbnu33ei6nnnhd67 at 4ax.com...
>>> I thought it should, too. But when tested on Windows with dmd 1.010,
>>> the tuple version is significantly slower. I'm still not sure why.
>>
>> Ahh, looking at the disassembly it makes sense now. What happens is that
>> when you write:
>>
>>     foreach(i, arg; args)
>>         t.tupleof[i] = arg;
>>
>> It gets turned into something like _this_:
>>
>> typeof(args[0]) arg0 = args[0];
>> t.tupleof[0] = arg0;
>> typeof(args[1]) arg1 = args[1];
>> t.tupleof[1] = arg1;
>> typeof(args[2]) arg2 = args[2];
>> t.tupleof[2] = arg2;
>>
>> Notice it copies the argument value into a temp variable, then that temp
>> variable into the struct. Very inefficient.
>>
>> Unfortunately I don't know of any way to get around this..
>
>Yes, DMD does that, *unless you turn on optimizations* ;).
>Measuring performance without optimization switches is pretty much useless.
>
>With optimizations it just moves mem->reg, reg->mem. It generates code
>bit-for-bit identical to:
>---
>    static S opCall(int x_, float y_, char[] z_) {
>        S s = void;
>        s.x = x_;
>        s.y = y_;
>        s.z = z_;
>        return s;
>    }
>---
>for the version Max posted (with =void)
>
>(The only difference is the mangled name; the mixin name is in there for
>the mixed-in version)
When compiling on Win XP with dmd 1.010 using -O -inline -release, the
time difference between the mixin (tuple) version and the hand-written
opCall is still more than 40%. This is the source:
import std.stdio;
import std.c.time;

template StructCtor()
{
    static typeof(*this) opCall(typeof(typeof(*this).tupleof) args)
    {
        typeof(*this) t = void;
        foreach(i, arg; args)
            t.tupleof[i] = arg;
        return t;
    }
}

struct Bar
{
    int x;
    int y;
    int z;

    //mixin StructCtor;

    static Bar opCall(int x, int y, int z)
    {
        Bar result = void;
        result.x = x;
        result.y = y;
        result.z = z;
        return result;
    }
}

void main()
{
    auto c = clock();
    for (int i = 0; i < 100000000; i++)
    {
        auto test = Bar(i, i, i);
    }
    writefln(clock() - c);
}
What am I doing wrong?
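
One way to narrow this down might be to time both constructors in a single
program and keep the constructed values live, so that the optimizer cannot
drop either loop. The following is only a sketch under several assumptions:
it targets a current D2 compiler rather than dmd 1.010 (so typeof(this)
replaces typeof(*this), and core.time.MonoTime replaces clock()), and the
names BarMixin, BarManual and timeIt are hypothetical helpers that do not
appear in the test above.

```d
import core.time : MonoTime;
import std.stdio : writefln;

// Mixin constructor from the thread, adapted to D2 (assumption):
// typeof(this) is used where the D1 code wrote typeof(*this).
mixin template StructCtor()
{
    static typeof(this) opCall(typeof(typeof(this).tupleof) args)
    {
        typeof(this) t = void;
        // Unrolled at compile time: one assignment per field.
        foreach (i, arg; args)
            t.tupleof[i] = arg;
        return t;
    }
}

// Hypothetical struct using the mixed-in constructor.
struct BarMixin
{
    int x;
    int y;
    int z;
    mixin StructCtor;
}

// Hypothetical struct using the hand-written constructor.
struct BarManual
{
    int x;
    int y;
    int z;

    static BarManual opCall(int x, int y, int z)
    {
        BarManual result = void;
        result.x = x;
        result.y = y;
        result.z = z;
        return result;
    }
}

// Hypothetical helper: constructs T `iterations` times and returns the
// elapsed milliseconds. The fields are summed into `sink` so that the
// constructed values are actually used.
long timeIt(T)(int iterations, out long sink)
{
    auto start = MonoTime.currTime;
    long sum = 0;
    foreach (i; 0 .. iterations)
    {
        auto v = T(i, i, i);
        sum += v.x + v.y + v.z;
    }
    sink = sum;
    return (MonoTime.currTime - start).total!"msecs";
}

void main()
{
    enum iterations = 100_000_000;
    long sinkMixin, sinkManual;

    auto msMixin = timeIt!BarMixin(iterations, sinkMixin);
    auto msManual = timeIt!BarManual(iterations, sinkManual);

    // Both constructors must produce the same field values.
    assert(sinkMixin == sinkManual);

    writefln("mixin:  %s ms", msMixin);
    writefln("manual: %s ms", msManual);
}
```

Running both variants from one binary at least removes differences between
separate builds from the comparison. Whether the gap measured with dmd
1.010 shows up here as well would still have to be checked against the
disassembly.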