Always false float comparisons

Joakim via Digitalmars-d digitalmars-d at puremagic.com
Wed May 18 00:21:30 PDT 2016


On Wednesday, 18 May 2016 at 05:49:16 UTC, Ola Fosheim Grøstad 
wrote:
> On Wednesday, 18 May 2016 at 03:01:14 UTC, Joakim wrote:
>> There is nothing "random" about increasing precision till the 
>> end, it follows a well-defined rule.
>
> Can you please quote that well-defined rule?

It appears to be "the compiler carries everything internally to 
80 bit precision, even if they are typed as some other precision."
http://forum.dlang.org/post/nh59nt$1097$1@digitalmars.com

> It is indeed random, or arbitrary (which is the same thing):

No, they're not the same thing: rules can be arbitrarily set yet 
consistent over time, whereas random usually means both arbitrary 
and inconsistent over time.

> if(x<0){
>   // DMD choose 64 bit mantissa
>   const float y = ...
>   ...
>
> } else {
>   // DMD choose 24 bit mantissa
>   float y = ...
>   ...
> }
>
> How is this not arbitrary?

I believe that means any calculation used to compute y at 
compile-time will be done in 80-bit or larger reals, then rounded 
to a const float for run-time, so your code comments would be 
wrong.

> If x is the amplitude then a flaw like this can cause a DC 
> offset to accumulate and you end up with reduced precision, not 
> improved precision. A DC filter at the end does not help on 
> this precision loss.

I don't understand why you're using const for one block and not 
the other; it seems like a contrived example.  If the precision of 
such constants matters so much, I'd be careful to use the same 
const float everywhere.

>> So you're planning on running phase-locking code partially in 
>> CTFE and runtime and it's somehow very sensitive to precision?
>>  If your "phase-locking" depends on producing bit-exact 
>> results with floating-point, you're doing it wrong.
>
> I am not doing anything wrong. D is doing it wrong. If you add 
> different deltas then you will get drift. So, no improved 
> precision in calculating some deltas is not improving the 
> accuracy. It makes it worse.

If matching such small deltas matters so much, I wouldn't be 
using floating-point in the first place.

>> If any of this depends on comparing bit-exact floating-point 
>> results, you're doing it wrong.
>
> It depends on the unit tests running with the exact same 
> precision as the production code.

What makes you think they don't?

> Fast floating point code depends on the specifics of the 
> hardware. A system level language should not introduce a 
> different kind of bias that isn't present in the hardware!

He's doing this to take advantage of the hardware, not the 
opposite!

> D is doing it wrong because it is thereby forcing 
> programmers to use algorithms that are 10-100x slower to get 
> reliable results.
>
> That is _wrong_.

If programmers want to run their code 10-100x slower to get 
reliably inaccurate results, that is their problem.

>> If the constant is calculated rather than a literal, you 
>> should be checking using approxEqual.
>
> No. 1+2+3+4 is exact on all floating point units I know of.

If you're so convinced it's exact for a few cases, then check 
exact equality there.  For most calculations, you should be using 
approxEqual.

> Btw, have you ever tried to prove error bounds for an iterative 
> method?
> You actually think most people prove them to be correct?
>
> Or perhaps almost all of them just pick a number out of thin 
> air which they think will work out and rely on testing their 
> code?

No, I have never done so, and I'm well aware that most just pull 
the error bound they use out of thin air.

> Well, the latter is no better than checking for exact equality. 
> And I can assure you that the vast majority of programmers do 
> not prove error bounds with the level of rigour it takes to get 
> it correct.

Even an unproven error bound is better than "exact" equality, 
which implicitly assumes an error bound smaller than the smallest 
difference representable at that precision.  Since the real error 
bound is almost always larger than that, nearly any error bound 
you pick will be closer to the real one, or at least bigger and 
therefore more realistic, than checking for exact equality.

> The reality is that it is common practice to write code that 
> seems to work. But that does not make it correct. However, 
> making it correct is way too time consuming and often not worth 
> the trouble. So people rely on testing. Floating point code is 
> no different.
>
> But with D semantics you cannot rely on testing. That's bad, 
> because most people write incorrect code. Whether they are 
> experts or not. (it is only a matter of difference in frequency)

You can still test with approxEqual, so I don't understand why 
you think that's not testing.  std.math uses feqrel, approxEqual, 
and equalsDigit and other such lower-precision checks extensively 
in its tests.

>>> f(x) = 1/(2-x)
>>>
>>> Should I not be able to test for the exact value "2" here?
>>
>> It would make more sense to figure out what the max value of 
>> f(x) is you're trying to avoid, say 1e6, and then check for 
>> approxEqual(x, 2, 2e-6).  That would make much more sense than 
>> only avoiding 2, when an x that is arbitrarily close to 2 can 
>> also blow up f(x).
>
> I am trying to avoid an exception, not a value.

That is the problem.  In the real world, all such formulas are 
approximations that only apply over a certain range.  If you were 
cranking that out by hand and some other calculation gave you an 
x really close to 2, say 2.0000035, you'd go back and check your 
math, as f(x) would blow up and give you unrealistically large 
numbers for the rest of your calculations.

The computer doesn't know that, so it will just plug that x in 
and keep cranking until you get nonsense data out the end, unless 
you tell it to reject any x too close to 2, not just x exactly 
equal to 2.

You have a wrong mental model that the math formulas are the 
"real world," and that the computer is mucking it up.  The truth 
is that the computer, with its finite maximums and bounded 
precision, better models _the measurements we make to estimate 
the real world_ than any math ever written.

>> Oh, it's real world alright, you should be avoiding more than 
>> just 2 in your example above.
>
> Which number would that be?

I told you, any numbers too close to 2.

>> Simply repeating the word "random" over and over again does 
>> not make it so.
>
> That's right. It is DMD that makes it so, not my words.
>
> However, in order to reject what other say, you have to make an 
> argument. And in this case we have:
>
> 1. A system level programming language that claims to excel at 
> floating point.
> 2. Hardware platforms with specific behaviour.
>
> Unfortunately 2 is true, but not 1.
>
> D is not matching up to the minimum requirements for people 
> wanting to write fast and reliable floating point code for a 
> given platform.

On the contrary, it is done because 80-bit is faster and more 
precise, whereas your notion of reliable depends on an incorrect 
notion that repeated bit-exact results are better.

> According to the D spec, the compiler could schedule typed 
> single precision floating point calculations to two completely 
> different floating point units (and yes, there are platforms 
> that provide multiple incompatible floating point units with 
> widely differing characteristics).

You noted that you don't care that the C++ spec says similar 
things, so I don't see why you care so much about the D spec now. 
As for that scenario, nobody has suggested it.

> That is random.

It may be arbitrary, but it is not random unless it's 
inconsistently done.

> And so is "float" behaving differently than "const float".

I don't believe it does.

