DMD 1.001 release is broken
Frits van Bommel
fvbommel at REMwOVExCAPSs.nl
Fri Jan 26 02:57:50 PST 2007
Walter Bright wrote:
> Frits van Bommel wrote:
>> Here's one that fails (tested on DMD v1.00, v1.002):
>>
>> -----
[snip]
>> -----
>>
>> The problem here is return value/parameter aliasing. It initializes
>> the return value before even looking at the parameter...
>
> I believe this fails with C++, too. But I don't think there's any
> solution to it but completely disable NRVO. There's no reasonable way
> for the compiler to detect that the d's are aliased. But NRVO is too
Maybe not detect, but it can disable NRVO if there's an inout parameter
with the same (or in case of Objects, related) type as the return value.
Not sure if that's a good idea though, since it may not be needed in the
majority of cases.
[later] Maybe it could just automatically create the temp variable if
the place to put the return value is also passed by reference to the
function? Then you'd still get the benefit of NRVO for calls that don't
do that and you get the benefit of working code for calls that do ;).
Of course, that only solves _that_ particular aliasing problem...
> valuable an optimization. So instead I'll argue that this is akin to:
> i = i++;
> i.e. it's an order-of-evaluation issue that should be avoided.
I disagree. Function calls are a well-established operation to serialize
operations (C++'s sequence points). Conceptually, the assignment happens
after the return and optimizations should not affect that.
IMHO, at least. (And it would seem "In G++'s Humble Opinion" as well,
see below)
> NRVO aliasing can be avoided by using a temporary:
> Data ret2 = ret;
> return ret2;
> instead of:
> return ret;
> Passing d by value instead of by ref will do the same thing. I assume
> that the reason inout is used here is to pass by reference to gain
> efficiency. Disabling NRVO will lose more than every bit of efficiency
> gained by passing by reference - so you might as well not pass it by
> reference, but by value. (NRVO eliminates two copies of the struct, not
> one. Replacing inout with in adds one copy. So NRVO is still one copy
> ahead <g>.)
As I mentioned above, I believe this temporary could be created
automatically when necessary.
Now, about your statement:
> I believe this fails with C++, too.
Not with g++[1].
I even tried calling the D function from C++ (since the g++-compiled
translation happens to read both values before writing to them) and
*that* worked fine, so it would seem g++ fixes it in the caller:
-----
urxae at urxae:~/tmp$ cat callfrob.cpp
#include <iostream>
#include <cassert>
struct Data
{
int x, y;
/// To make size > 8 so NRVO is used.
/// Program runs correctly with this line commented out:
char filler;
};
extern "C" Data frob(Data& d);
extern"C" void callfrob() {
using namespace std;
Data d; d.x = 1; d.y = 2;
d = frob(d);
cout << "C++:" << endl << d.x << endl << d.y << endl;
assert(d.x == 3 && d.y == -1);
}
urxae at urxae:~/tmp$ g++ -c callfrob.cpp -o callfrob.o
urxae at urxae:~/tmp$ cat frob.d
import std.stdio;
struct Data
{
int x, y;
/// To make size > 8 so NRVO is used.
/// Program runs correctly with this line commented out:
byte filler;
}
extern(C) Data frob(inout Data d)
{
Data ret;
ret.y = d.x - d.y;
ret.x = d.x + d.y;
return ret;
}
extern(C) void callfrob();
void main() {
Data d; d.x = 1; d.y = 2;
d = frob(d);
writefln("D:");
writefln(d.x);
writefln(d.y);
callfrob();
}
urxae at urxae:~/tmp$ dmd frob.d callfrob.o -L-lstdc++ -offrob
gcc frob.o callfrob.o -o frob -m32 -lphobos -lpthread -lm -Xlinker
-lstdc++ -Xlinker -L/home/urxae/opt/dmd/lib
urxae at urxae:~/tmp$ ./frob
D:
0
0
C++:
3
-1
-----
The relevant part of callfrob.o's 'objdump -Sr':
-----
7a: c7 45 ec 01 00 00 00 movl $0x1,0xffffffec(%ebp)
81: c7 45 f0 02 00 00 00 movl $0x2,0xfffffff0(%ebp)
88: 8d 55 d8 lea 0xffffffd8(%ebp),%edx
8b: 8d 45 ec lea 0xffffffec(%ebp),%eax
8e: 89 44 24 04 mov %eax,0x4(%esp)
92: 89 14 24 mov %edx,(%esp)
95: e8 fc ff ff ff call 96 <callfrob+0x24>
96: R_386_PC32 frob
9a: 83 ec 04 sub $0x4,%esp
9d: 8b 45 d8 mov 0xffffffd8(%ebp),%eax
a0: 89 45 ec mov %eax,0xffffffec(%ebp)
a3: 8b 45 dc mov 0xffffffdc(%ebp),%eax
a6: 89 45 f0 mov %eax,0xfffffff0(%ebp)
a9: 8b 45 e0 mov 0xffffffe0(%ebp),%eax
ac: 89 45 f4 mov %eax,0xfffffff4(%ebp)
-----
So it would seem g++ in this situation indeed creates a temporary
"variable" on the stack, uses it for the return value, and then copies
it over the actual variable.
Presumably it does this for a reason. Either it's required by the C++
standard or they thought it'd be polite to DWIM in this situation...
And no, it doesn't do that if I change the line with the call to 'Data e
= frob(d)' (and the output to show e.x & e.y):
-----
7a: c7 45 ec 01 00 00 00 movl $0x1,0xffffffec(%ebp)
81: c7 45 f0 02 00 00 00 movl $0x2,0xfffffff0(%ebp)
88: 8d 55 e0 lea 0xffffffe0(%ebp),%edx
8b: 8d 45 ec lea 0xffffffec(%ebp),%eax
8e: 89 44 24 04 mov %eax,0x4(%esp)
92: 89 14 24 mov %edx,(%esp)
95: e8 fc ff ff ff call 96 <callfrob+0x24>
96: R_386_PC32 frob
9a: 83 ec 04 sub $0x4,%esp
9d: 8b 75 e4 mov 0xffffffe4(%ebp),%esi
a0: 8b 5d e0 mov 0xffffffe0(%ebp),%ebx
-----
Those last two instructions were included to show the absence of the
copy instructions, the first disassembly had similar ones:
-----
af: 8b 75 f0 mov 0xfffffff0(%ebp),%esi
b2: 8b 5d ec mov 0xffffffec(%ebp),%ebx
-----
(Presumably, they're preparations for the output calls)
And by the way, I also compiled the C++ with -O2 and with -O3. It still
performed the copy, just more efficiently ;):
-----
93: e8 fc ff ff ff call 94 <callfrob+0x24>
94: R_386_PC32 frob
98: 8b 5d d8 mov 0xffffffd8(%ebp),%ebx
9b: 8b 75 dc mov 0xffffffdc(%ebp),%esi
9e: 8b 45 e0 mov 0xffffffe0(%ebp),%eax
a1: 89 5d ec mov %ebx,0xffffffec(%ebp)
a4: 89 75 f0 mov %esi,0xfffffff0(%ebp)
a7: 83 ec 04 sub $0x4,%esp
aa: 89 45 f4 mov %eax,0xfffffff4(%ebp)
-----
Versions used to test:
-----
urxae at urxae:~/tmp$ g++ --version | head -n 1
g++ (GCC) 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)
urxae at urxae:~/tmp$ dmd | head -n 1
Digital Mars D Compiler v1.002
-----
[1]: g++ is the only C++ compiler I have. If it doesn't work in DMC, I
humbly suggest you fix that one too :P.
More information about the Digitalmars-d-announce
mailing list