Question/request/bug(?) re. floating-point in dmd
qznc
qznc at web.de
Wed Oct 23 23:12:01 PDT 2013
On Wednesday, 23 October 2013 at 15:44:54 UTC, Apollo Hogan wrote:
> For example, the appended code produces the following output
> when compiled (DMD32 D Compiler v2.063.2 under WinXP/cygwin)
> with no optimization:
>
> immutable(pair)(1.1, -2.03288e-20)
> pair(1, 0.1)
> pair(1.1, -8.32667e-17)
>
> and the following results when compiled with optimization (-O):
>
> immutable(pair)(1.1, -2.03288e-20)
> pair(1, 0.1)
> pair(1.1, 0)
>
> The desired result would be:
>
> immutable(pair)(1.1, -8.32667e-17)
> pair(1, 0.1)
> pair(1.1, -8.32667e-17)
>
> Cheers,
> --Apollo
>
> import std.stdio;
> struct pair { double hi, lo; }
> pair normalize(pair q)
> {
>     double h = q.hi + q.lo;
>     double l = q.lo + (q.hi - h);
>     return pair(h,l);
> }
> void main()
> {
>     immutable static pair spn = normalize(pair(1.0,0.1));
>     writeln(spn);
>     writeln(pair(1.0,0.1));
>     writeln(normalize(pair(1.0,0.1)));
> }
I can replicate it here. Here is an objdump diff of normalize:
Optimized:

08076bdc <_D6fptest9normalizeFS6fptest4pairZS6fptest4pair>:
 8076bdc: 55          push   %ebp
 8076bdd: 8b ec       mov    %esp,%ebp
 8076bdf: 83 ec 10    sub    $0x10,%esp
 8076be2: dd 45 08    fldl   0x8(%ebp)
 8076be5: d9 c0       fld    %st(0)
 8076be7: 89 c1       mov    %eax,%ecx
 8076be9: dc 45 10    faddl  0x10(%ebp)
 8076bec: dd 55 f0    fstl   -0x10(%ebp)
 8076bef: de e9       fsubrp %st,%st(1)
 8076bf1: dd 45 f0    fldl   -0x10(%ebp)
 8076bf4: d9 c9       fxch   %st(1)
 8076bf6: dc 45 10    faddl  0x10(%ebp)
 8076bf9: dd 5d f8    fstpl  -0x8(%ebp)
 8076bfc: dd 45 f8    fldl   -0x8(%ebp)
 8076bff: d9 c9       fxch   %st(1)
 8076c01: dd 19       fstpl  (%ecx)
 8076c03: dd 59 08    fstpl  0x8(%ecx)
 8076c06: 8b e5       mov    %ebp,%esp
 8076c08: 5d          pop    %ebp
 8076c09: c2 10 00    ret    $0x10

Unoptimized:

08076bdc <_D6fptest9normalizeFS6fptest4pairZS6fptest4pair>:
 8076bdc: 55          push   %ebp
 8076bdd: 8b ec       mov    %esp,%ebp
 8076bdf: 83 ec 14    sub    $0x14,%esp
 8076be2: dd 45 08    fldl   0x8(%ebp)
 8076be5: dc 45 10    faddl  0x10(%ebp)
 8076be8: dd 5d ec    fstpl  -0x14(%ebp)
 8076beb: dd 45 08    fldl   0x8(%ebp)
 8076bee: dc 65 ec    fsubl  -0x14(%ebp)
 8076bf1: dc 45 10    faddl  0x10(%ebp)
 8076bf4: dd 5d f4    fstpl  -0xc(%ebp)
 8076bf7: dd 45 ec    fldl   -0x14(%ebp)
 8076bfa: dd 18       fstpl  (%eax)
 8076bfc: dd 45 f4    fldl   -0xc(%ebp)
 8076bff: dd 58 08    fstpl  0x8(%eax)
 8076c02: c9          leave
 8076c03: c2 10 00    ret    $0x10
 8076c06: 90          nop
 8076c07: 90          nop
 8076c08: 90          nop
 8076c09: 90          nop
 8076c0a: 90          nop
 8076c0b: 90          nop

I cannot see any significant difference. The fadd-fsub-fadd
sequence seems to be the same in both cases.
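
One detail in the two listings may still matter, though. In the unoptimized code the sum is stored with fstpl and the subtraction then reads that stored 64-bit value back (fsubl -0x14(%ebp)). In the optimized code the sum is stored with fstl, which leaves it on the x87 stack, and fsubrp works on the register copy, which may still carry extended precision. That would be consistent with the low word coming out as 0. A minimal sketch of a variant that asks for the intermediates to be rounded to double is below. It assumes a druntime recent enough to provide core.math.toPrec (as far as I know this is not in the 2.063.2 release used above), and the name normalizeRounded is made up for illustration.

```d
import core.math : toPrec;
import std.stdio;

struct pair { double hi, lo; }

// Hypothetical variant of normalize: each intermediate is passed
// through toPrec!double so that the compiler is asked to round it
// to double precision before it is used again.
pair normalizeRounded(pair q)
{
    double h = toPrec!double(q.hi + q.lo);       // rounded sum
    double l = q.lo + toPrec!double(q.hi - h);   // error term uses the rounded h
    return pair(h, l);
}

void main()
{
    writeln(normalizeRounded(pair(1.0, 0.1)));
}
```

If the extended-precision guess is right, this should print the same pair with and without -O. I have not checked it against the compiler version from the original report.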