Printing shortest decimal form of floating point number with Mir
9il
ilyayaroshenko at gmail.com
Tue Jan 5 07:22:52 UTC 2021
On Tuesday, 5 January 2021 at 03:20:16 UTC, Walter Bright wrote:
> On 1/4/2021 4:11 AM, 9il wrote:
>> [...]
> The reason those switches are provided is because the
> write/read is a performance hog.
>
> D provides a couple functions in druntime which guarantee
> rounding intermediate values to float/double precision. Those
> can be used as required. This is better than a compiler switch
> because having compiler switches that influence floating point
> results is poor design.
>
> > Since C99 the default x87 behavior is precise.
>
> Not entirely:
>
> float f(float a, float b) {
> float d = (a + b) - b;
> return d;
> }
>
> f:
> sub esp, 4
> fld DWORD PTR [esp+12]
> fld st(0)
> fadd DWORD PTR [esp+8]
> [no write/read to memory here, so no round to float]
> fsubrp st(1), st
> fstp DWORD PTR [esp]
> fld DWORD PTR [esp]
> add esp, 4
> ret
>
> In any case, let's try your example
> https://cpp.godbolt.org/z/7sa8dP with dmd for 32 bits:
>
> push EAX
> push EAX
> fld float ptr 010h[ESP]
> fadd float ptr 0Ch[ESP]
> fstp float ptr [ESP] // there's the write
> fld float ptr [ESP] // there's the read!
> fsub float ptr 0Ch[ESP]
> fstp float ptr 4[ESP] // the write
> fld float ptr 4[ESP] // the read
> add ESP,8
> ret 8
>
> It's semantically equivalent to the godbolt asm you posted.
I can't reproduce your DMD output.
With the flags -m32 -O, DMD generates
https://cpp.godbolt.org/z/9b4e9K
assume CS:.text._D7example1fFffZf
push EBP
mov EBP,ESP
fld float ptr 0Ch[ESP]
fadd float ptr 8[EBP]
fsub float ptr 8[EBP]
pop EBP
ret 8
add [EAX],AL
add [EAX],AL
As you can see, there are no write/read instructions.
With only the -m32 flag (no -O), DMD generates
https://cpp.godbolt.org/z/GMGMra
assume CS:.text._D7example1fFffZf
push EBP
mov EBP,ESP
sub ESP,018h
movss XMM0,0Ch[EBP]
movss XMM1,8[EBP]
addss XMM0,XMM1
movss -8[EBP],XMM0
subss XMM0,XMM1
movss -4[EBP],XMM0
movss -018h[EBP],XMM0
fld float ptr -018h[EBP]
leave
ret 8
add [EAX],AL
It just uses SSE, which I think is a good way to go, haha. Since
no one has reported this as a bug, presumably all real-world DMD
targets have at least SSE support.
The only D compiler that uses excess precision is DMD, and only
when the -O flag is passed. The same example compiled with GDC
uses write/read instructions. LDC uses SSE instructions.
As for C, it offers an intuitive built-in way to get exact
precision: an assignment acts as a directive to round the
expression result to the precision of the assigned type, unlike
in D. It doesn't cover all cases, but it is an intuitive and very
easy way to do things the right way.
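
A minimal C sketch of that assignment rule, assuming a compiler that
follows the C99 wording that assignment and cast remove all extra
range and precision, and that is not run with fast-math style flags.
The function names and input values are hypothetical, chosen only for
illustration: with a = 1 and b = 2^24, the sum 2^24 + 1 is not
representable as a float, so rounding the intermediate to float gives
0, while keeping it in a wider format gives 1.

```c
#include <assert.h>

/* The intermediate sum is assigned to a float variable. Under the
   C99 rule this assignment rounds the sum to float precision even
   if the expression itself was evaluated in a wider format. */
float sub_rounded(float a, float b) {
    float sum = a + b;  /* rounded to float here */
    float d = sum - b;
    return d;
}

/* The same computation with the intermediate deliberately kept in
   double, imitating what excess precision would produce. */
float sub_wide(float a, float b) {
    double sum = (double)a + (double)b;  /* exact for these inputs */
    return (float)(sum - (double)b);
}

/* Hypothetical self-check: a = 1, b = 2^24. */
void check_excess_precision_demo(void) {
    const float a = 1.0f;
    const float b = 16777216.0f;  /* 2^24 */
    assert(sub_wide(a, b) == 1.0f);     /* wide intermediate keeps the 1 */
    assert(sub_rounded(a, b) == 0.0f);  /* float intermediate loses it */
}
```

Calling check_excess_precision_demo() from main should pass on a
conforming compiler. A compiler that keeps the sum in an x87 register
without rounding it would return 1 from sub_rounded as well.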