Handling big FP numbers

H. S. Teoh hsteoh at quickfur.ath.cx
Sat Feb 9 03:03:41 UTC 2019


On Sat, Feb 09, 2019 at 02:12:29AM +0000, Murilo via Digitalmars-d-learn wrote:
> Why is it that in C when I attribute the number
> 99999912343000007654329925.7865 to a double it prints
> 99999912342999999470108672.0000 and in D it prints
> 99999912342999999000000000.0000 ? Apparently both languages cause a
> certain loss of precision(which is part of converting the decimal
> system into the binary system) but why is it that the results are
> different?

It's not only because of converting decimal to binary; it's also because
`double` only has 64 bits to store information, and your number has far
more digits than can possibly fit into 64 bits.  Of those 64 bits, 12 are
used for the sign and exponent, leaving a 53-bit significand (52 stored
bits plus an implicit leading one), so `double` really can only
store approximately 15 decimal digits.  Anything beyond that simply
doesn't exist in a `double`, so attempting to print that many digits
will inevitably produce garbage trailing digits.  If you round the above
outputs to about 15 digits, you'll see that they are essentially equal.
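
As a rough illustration of that rounding (sketched in Python rather than C
or D, on the assumption that its `float` is the same IEEE 754 64-bit
double), the two outputs quoted in the question can be rounded to the
number of digits a double reliably holds and compared.  The helper name
`round_to_digits` is made up for this sketch:

```python
import sys
from decimal import Context

# Assumption: Python's float is an IEEE 754 binary64 double on this platform.
MANT_BITS = sys.float_info.mant_dig   # significand bits, including the implicit one
SAFE_DIGITS = sys.float_info.dig      # decimal digits that reliably survive a round trip

c_output = "99999912342999999470108672.0000"   # printed by C, per the question
d_output = "99999912342999999000000000.0000"   # printed by D, per the question

def round_to_digits(text, digits):
    """Round a decimal string to the given number of significant digits."""
    return Context(prec=digits).create_decimal(text)

c_rounded = round_to_digits(c_output, SAFE_DIGITS)
d_rounded = round_to_digits(d_output, SAFE_DIGITS)

print(MANT_BITS, SAFE_DIGITS)
print(c_rounded, d_rounded, c_rounded == d_rounded)
```

Both strings collapse to the same value once the digits past the 15th are
rounded away, which is all a `double` can promise here.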

As to why different output is produced, that's probably just an artifact
of different printing algorithms used to output the number.  There may
be a small amount of difference around or after the 15th digit because D
implicitly converts to `real` (which on x86 is the 80-bit x87
extended-precision format) for some operations and rounds it back, while C
only operates on 64-bit double.  This may cause some slight difference
in behaviour around the 15th digit or so.
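
A small sketch of the "different printing algorithms" point, again in
Python and again assuming its `float` is an IEEE 754 double.  It does not
reproduce either C's or D's formatter; it only shows that one stored value
can be written out in more than one way, and that both outputs from the
question parse back to the same double:

```python
# The number from the question, stored as a double.
x = 99999912343000007654329925.7865

# Two formatting strategies applied to the same stored value.
exact_expansion = "%.4f" % x   # decimal expansion of the stored binary value
shortest_form = repr(x)        # shortest digit string that still parses back to x

# The two outputs quoted in the question, parsed back into doubles.
c_output = float("99999912342999999470108672.0000")
d_output = float("99999912342999999000000000.0000")

print(exact_expansion)
print(shortest_form)
print(c_output == d_output == x)
```

The digits past roughly the 15th differ between the strings, but the
underlying 64-bit value is identical in every case.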

Either way, it is not possible for `double` to reliably hold more than about 15
decimal digits, and any output produced from the non-existent digits
following that is suspect and probably just random garbage.

If you want to hold more than 15 digits, you'll either have to use
`real`, which depending on your CPU will be 80-bit (x86) or 128-bit (a
few newer, less common CPUs), or an arbitrary-precision library that
simulates larger precisions in software, like MPFR (which is built on GMP).
Note, however, that even 80-bit `real` realistically only holds up to
about 18 digits, which isn't very much more than a double, and still far
too small for your number above.  You need at least a 128-bit quadruple
precision type (which can represent up to about 34 digits) in order to
represent your above number accurately.
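
To put rough numbers on the digit counts, here is a sketch in Python.  It
uses the standard `decimal` module as a stand-in for a software
arbitrary-precision library (it is not MPFR), and the helper name
`decimal_digits` is made up for this example.  The formula counts digits
guaranteed to survive a round trip, which is slightly more conservative
than the approximate figures quoted above:

```python
import math
from decimal import Decimal, getcontext

def decimal_digits(significand_bits):
    """Decimal digits guaranteed to survive a round trip through a binary
    format with this many significand bits: floor((p - 1) * log10(2))."""
    return math.floor((significand_bits - 1) * math.log10(2))

# Significand widths (counting the leading bit) of the formats discussed.
formats = {"double (64-bit)": 53, "x87 real (80-bit)": 64, "quadruple (128-bit)": 113}
digits = {name: decimal_digits(bits) for name, bits in formats.items()}

# The number from the question, kept as a string so no digits are lost on input.
original = "99999912343000007654329925.7865"
needed = len(original.replace(".", ""))   # significant digits in the literal

# Software arbitrary precision with 34 digits, comparable to quadruple precision.
getcontext().prec = 34
exact = Decimal(original)

print(digits, needed)
print(exact + 0)   # arithmetic at 34 digits leaves every digit intact
```

The literal needs 30 significant digits, which is more than either
`double` or 80-bit `real` can hold but fits within quadruple precision or
a 34-digit software decimal.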


T

-- 
The fact that anyone still uses AOL shows that even the presence of options doesn't stop some people from picking the pessimal one. - Mike Ellis

