float equality

Sun Feb 20 11:08:56 PST 2011

== Quote from Jonathan M Davis (jmdavisProg at gmx.com)'s article
...
> It may be that you would still end up with situations where two values that you
> would think would be the same aren't due to rounding error or whatnot. However,
> with a fixed point value, you wouldn't have the problem where a particular value
> could not be held in it even if it's within its range of precision. As I
> understand it, there are a number of values which cannot be held in a floating
> point and therefore end up being rounded up or down simply because of how
> floating points work and not because there the precision isn't high enough.
>
> It's definitely true however, that using fractions would be much more accurate
> for a lot of stuff. That wouldn't be particulary efficient though. Still, if you're
> doing a lot of math that needs to be accurate, that may be the way to go.
>
> - Jonathan M Davis

Floating point numbers are a good compromise for a lot of purposes, but yeah they
are limited.  Here are some ways to open up the limits a bit...

(First some math for those who haven't gone down this path...)

The reason that fixed point does better than floating is that fixed point classes
use base ten and floating point uses base 2.  In base two, all fractions that have
a denominator of a power of 2 can be represented (e.g. 3/16) exactly, and all others
can't.

Fixed point solves this for numbers like .1 or .0003 because the denominator is a
power of ten.  Ten is 5*2 and the way it works is that any denominator that is a power
of two times a power of ten can be represented exactly.

So you're good for .01 or .039028 because these are (1/100) and (39028/1000000).  But
neither fixed point nor floating can represent something like 1/3 or 1/11 exactly.

If you used base 3, then 1/3 would be perfectly representable but 1/2 would not be.
I think this is why the ancients used 360 degrees in a circle... you can represent
a lot of common fractions as parts of 360, including 1/2, 1/3, 1/4, 1/5, 1/6, 1/8.
But still not 1/7 or 1/11.

How to solve this problem?

You can define a type like this:

struct Rational {
    // represents numerator over denominator
    ... (various arithmetic methods and operators)...

    long numerator;
    long denominator;
}

(Lots of people have written classes like this, maybe even for D. ;)

If you've taken high school math and CS 101 you can write the methods here, and then
you can represent any fraction.  This is a better way to go for a lot of problems if
you need exactness, but its a bit slower.

This isn't perfect though ... if you try to add the numbers 1/2, 1/3.... 1/1000000
you will eventually find the denominator overflows -- you can check for this and
'reduce' the fraction (remove common divisors from top and bottom) periodically,
but eventually you will even get an irreducable fraction and you will need, again,
to compromise and give away some precision.  The simplest way to do this compromise
is probably to divide the top and bottom number by 2, try to reduce again, and then
try the operation again (repeat until the operation doesn't overflow).  (Maybe there
is a better way though.)

For some problems the precision loss will never happen -- for example if you are
adding dollars and cents represented as (x/100) then the denominator never needs
to get bigger than 100.  If you need to compute mortgage payments it might get a
little bigger... but I'm not sure if it will ever overflow for this case.

Another source of compromise is when a number is really big or small, for example
a number like 2^200 or 1/2^200 can be represented in a 'double' but would overflow
a Rational as defined above.

You can delay the compromise a bit by using a struct like this:

struct Rational2 {
    // represents the number numerator / (denominator ^ exponent)
    // adjust integer types to taste
    long numerator;
    long denominator;
    long exponent;
};

This lets you go much further, and represent truly astonishingly large and small
numbers.  It complicates the math in the arithmetic operators a bit though.
Eventually, even this version will lose precision as the fraction still becomes
irreducable if there are a unique set of prime factors in the denominator.  So
this can represent much larger ranges than double or real, but I think it still
can't represent 1/2*1/3*....1/1000_000_000 exactly.

Now... what if you wanted to avoid the compromise completely?

You could switch to this:

struct {
    BigInt numerator;
    BigInt denominator;
};

Bingo -- no compromise.  Two down sides to this:

1. You trade off memory for accuracy as the BigInts grow over time, assuming
the fraction is irreducable.

2. It's slower -- BigInt objects are essentially byte strings and even things
like addition require a function call and looping.

This is a pretty flexible solution though.  However, numbers like pi and 'e'
can't be represented as a rational number at all, so you are still stuck if
you want to do logarithms or geometry.  And since all the rational solutions
will fail to represent PI just as floating point one does, you probably end
up using floating point since it does the operations much much quicker.

If you think about it, though, you can write a number like this:

1.123456< encyclopedia brittanica represented as decimals >123123

Since you can store unlimited amounts of information in a 'real' number,
you can never define a fixed size representation that will store any number.

Practically, it all comes down to what you need to represent.  If you can
quantify how much accuracy and what kinds of accuracy you need, then you
can probably build a system to represent that data.

Kevin