approxEqual() has fooled me for a long time...

Robert Jacques sandford at jhu.edu
Wed Oct 20 23:05:58 PDT 2010


On Thu, 21 Oct 2010 00:19:11 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> On 10/20/10 16:33 CDT, Don wrote:
>> Walter Bright wrote:
>>> Andrei Alexandrescu wrote:
>>>> On 10/20/10 13:42 CDT, Walter Bright wrote:
>>>>> Don wrote:
>>>>>> I'm personally pretty upset about the existence of that function at
>>>>>> all.
>>>>>> My very first contribution to D was a function for floating point
>>>>>> approximate equality, which I called approxEqual.
>>>>>> It gives equality in terms of number of bits. It gives correct  
>>>>>> results
>>>>>> in all the tricky special cases. Unlike a naive relative equality  
>>>>>> test
>>>>>> involving divisions, it doesn't fail for values near zero. (I  
>>>>>> _think_
>>>>>> that's the reason why people think you need an absolute equality  
>>>>>> test
>>>>>> as well).
>>>>>> And it's fast. No divisions, no poorly predictable branches.
>>>>>
>>>>> I totally agree that a precision based on the number of bits, not the
>>>>> magnitude, is the right approach.
>>>>
>>>> I wonder, could that be also generalized for zero? I.e., if a number
>>>> is zero except for k bits in the mantissa.
>>>
>>> Zero is a special case I'm not sure how to deal with.
>>
>> It does generalize to zero.
>> Denormals have the first k bits in the mantissa set to zero. feqrel
>> automatically treats them as 'close to zero'. It just falls out of the
>> maths.
>>
>> BTW if the processor has a "flush denormals to zero" mode, denormals
>> will compare exactly equal to zero.
>
> So here's a plan of attack:
>
> 1. Keep feqrel. Clearly it's a useful primitive.

vote++

> 2. Find a more intuitive interface for feqrel, i.e. using decimal digits  
> for precision etc. The definition in std.math: "the number of mantissa  
> bits which are equal in x and y" loses 90% of the readership at the word  
> "mantissa". You want something intuitive that people can immediately  
> picture.
>
> 3. Define a good name for that

When I think of floating point precision, I automatically think in ULPs  
(units in the last place). That is how IEEE 754 specifies the precision  
of the basic math operations, in part because a given ULP requirement  
applies to any floating point type. And the fact that ulp(x,y) <= 2 is  
meaningful for floats, doubles and reals alike is great for  
templates/generic programming.

Essentially, log2(ulp(x,y)) ≈ min(x.mant_dig, y.mant_dig) - feqrel(x,y);  
that is, the number of low-order mantissa bits in which x and y differ.
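For concreteness, here is a sketch of a ULP-distance computation using the usual reinterpret-the-bits trick (Python for brevity; the function names are mine, not std.math's). It also shows how denormals fall out as "close to zero" for free, as Don said:

```python
import struct

def ordered_int(x: float) -> int:
    """Map a double onto an integer such that adjacent representable
    doubles map onto adjacent integers (+0.0 and -0.0 both map to 0)."""
    (u,) = struct.unpack("<Q", struct.pack("<d", x))
    return u if u < 1 << 63 else (1 << 63) - u

def ulp_distance(x: float, y: float) -> int:
    """How many representable doubles lie between x and y."""
    return abs(ordered_int(x) - ordered_int(y))

print(ulp_distance(1/7 * 7, 1.0))   # tiny: at most a couple of ULPs
print(ulp_distance(0.0, 5e-324))    # 1: smallest denormal is 1 ulp from zero
```

Note there are no divisions and no data-dependent branches on the hot path, which matches the "fast" property claimed for feqrel.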

On this subject, I remember that immediately after learning about the "=="  
operator, I was instructed never, ever to use it for floating point values  
unless I knew for a fact that one value was a copy of the other. This, of  
course, leads to bad programming habits like:
"In floating-point arithmetic, numbers, including many simple fractions,  
cannot be represented exactly, and it may be necessary to test for  
equality within a given tolerance. For example, rounding errors may mean  
that the comparison in
a = 1/7
if a*7 = 1 then ...
unexpectedly evaluates false. Typically this problem is handled by  
rewriting the comparison as if abs(a*7 - 1) < tolerance then ..., where  
tolerance is a suitably tiny number and abs is the absolute value  
function." - from Wikipedia's Comparison (computer programming) page.
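A quick sketch (Python; the function name is mine) of why that fixed absolute tolerance is exactly the bad habit in question: the same tolerance behaves completely differently at different magnitudes, which is the failure mode a bit-based test avoids:

```python
def approx_equal_abs(x: float, y: float, tol: float = 1e-9) -> bool:
    """Naive absolute-tolerance comparison from the quoted recipe."""
    return abs(x - y) < tol

# Fine at magnitude ~1:
print(approx_equal_abs((1 / 7) * 7, 1.0))         # True

# At large magnitudes the same tolerance rejects values that agree
# to ~10 significant digits:
print(approx_equal_abs(1.0e20, 1.0000000001e20))  # False

# Near zero it accepts values that share no significant bits at all,
# not even the sign:
print(approx_equal_abs(1e-20, -1e-20))            # True
```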

Since D has the "is" operator, does it make sense to actually 'fix' "=="  
to be fuzzy? Or perhaps to make the set of extended floating point  
comparison operators fuzzy (e.g. "!<>" <=> ulp(x,y) <= 2)? Or just add an  
approximately-equal operator (e.g. "~~" or "=~", since nobody will type  
the actual Unicode ≈)? Perhaps even a tiered approach ("==" <=>  
ulp(x,y) <= 1, "~~" <=> ulp(x,y) <= 8).
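A tiered scheme like that could be prototyped as a library function before becoming an operator. Here is a rough sketch (Python; only an approximation of std.math.feqrel's semantics, and all names are assumptions of mine) of a bits-of-agreement test, with a tier expressed in bits rather than ULPs — a different unit, but the same shape:

```python
import math

MANT_DIG = 53  # double-precision mantissa bits, akin to double.mant_dig in D

def feqrel_approx(x: float, y: float) -> int:
    """Rough count of leading mantissa bits on which x and y agree:
    MANT_DIG for identical values, 0 for totally dissimilar ones."""
    if x == y:
        return MANT_DIG
    d = abs(x - y)
    m = max(abs(x), abs(y))
    if math.isnan(d) or d >= m:
        return 0
    return min(MANT_DIG, int(-math.log2(d / m)))

def loosely_equal(x: float, y: float, bits: int = MANT_DIG - 3) -> bool:
    """A "~~"-style tier: equal up to a few low-order bits of slop."""
    return feqrel_approx(x, y) >= bits
```

A real implementation would work on the exponent and mantissa bits directly (as feqrel does) rather than via log2, but the interface is the point here.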


More information about the Digitalmars-d mailing list