Always false float comparisons
Joakim via Digitalmars-d
digitalmars-d at puremagic.com
Wed May 18 04:16:44 PDT 2016
On Wednesday, 18 May 2016 at 09:21:30 UTC, Ola Fosheim Grøstad
wrote:
> On Wednesday, 18 May 2016 at 07:21:30 UTC, Joakim wrote:
>> On Wednesday, 18 May 2016 at 05:49:16 UTC, Ola Fosheim Grøstad
>> wrote:
>>> On Wednesday, 18 May 2016 at 03:01:14 UTC, Joakim wrote:
>>>> There is nothing "random" about increasing precision till
>>>> the end, it follows a well-defined rule.
>>>
>>> Can you please quote that well-defined rule?
>>
>> It appears to be "the compiler carries everything internally
>> to 80 bit precision, even if they are typed as some other
>> precision."
>> http://forum.dlang.org/post/nh59nt$1097$1@digitalmars.com
>
> "The compiler" means: implementation defined. That is the same
> as not being well-defined. :-)
Welcome to the wonderful world of C++! :D
More seriously, it is well-defined for that implementation; you
did not raise the issue of the spec till now. In fact, you
seemed not to care what the specs say.
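To make the 80-bit point concrete, here is a minimal sketch, not
code from this thread; whether the bit-exact comparison below
holds is compiler- and target-dependent, e.g. with x87 codegen
the right-hand side may still be sitting in an 80-bit register
when the comparison happens:

    import std.math : approxEqual;

    void main()
    {
        float a = 10.0f, b = 3.0f;
        float q = a / b;        // rounded to 32 bits when stored
        // May be compared at 80-bit precision before rounding on
        // some compilers/targets, so the result can differ:
        bool bitExact = (q == a / b);
        // A tolerance-based check is robust either way:
        assert(approxEqual(q, a / b));
    }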
>> I don't understand why you're using const for one block and
>> not the other, seems like a contrived example. If the
>> precision of such constants matters so much, I'd be careful to
>> use the same const float everywhere.
>
> Now, that is a contrived defense for brittle language
> semantics! :-)
No, it has nothing to do with language semantics and everything
to do with bad numerical programming.
>> If matching such small deltas matters so much, I wouldn't be
>> using floating-point in the first place.
>
> Why not? The hardware gives the same delta. It only goes wrong
> if the compiler decides to "improve".
Because floating-point is itself fuzzy, in so many different
ways. You are depending on exactly repeatable results with a
numerical type that wasn't meant for it.
>>> It depends on the unit tests running with the exact same
>>> precision as the production code.
>>
>> What makes you think they don't?
>
> Because the language says that I cannot rely on it and the
> compiler implementation proves that to be correct.
You keep saying this: where did anyone mention unit tests not
running with the same precision till you just brought it up out
of nowhere? The only prior mention was that compile-time
calculation of constants that are then checked for bit-exact
equality in the tests might have problems, but that's certainly
not all tests and I've repeatedly pointed out you should never be
checking for bit-exact equality.
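As a sketch of the failure mode I mean, assuming compile-time
folding may happen at a higher intermediate precision than the
run-time code uses (the commented-out bit-exact assert is the
fragile part):

    import std.math : approxEqual;

    // Folded at compile time, possibly at a higher intermediate
    // precision than the run-time code uses:
    enum float expected = 0.1f + 0.2f;

    unittest
    {
        float a = 0.1f, b = 0.2f;
        float actual = a + b;          // computed at run time
        // assert(actual == expected); // bit-exact: may fail on some targets
        assert(approxEqual(actual, expected));
    }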
>>> D is doing it wrong because it is thereby forcing
>>> programmers to use algorithms that are 10-100x slower to get
>>> reliable results.
>>>
>>> That is _wrong_.
>>
>> If programmers want to run their code 10-100x slower to get
>> reliably inaccurate results, that is their problem.
>
> Huh?
The point is that what you consider reliable will be less
accurate, sometimes much less.
>> If you're so convinced it's exact for a few cases, then check
>> exact equality there. For most calculation, you should be
>> using approxEqual.
>
> I am sorry, but this is not a normative rule at all. The rule
> is that you check for the bounds required. If it is exact, it
> just means the bounds are the same value (e.g. tight).
>
> It does not help to say that people should use "approxEqual",
> because it does not improve on correctness. Saying such things
> just means that non-expert programmers assume that guessing the
> bounds will be sufficient. Well, it isn't sufficient.
The point is that there are _always_ bounds, so you can never
check for the same value. Almost any guessed bounds will be
better than incorrectly checking for the bit-exact value.
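Concretely, "checking for the bounds required" can be written
down directly; this is only a sketch, and someCalculation and the
tolerances are made up for illustration:

    import std.math : approxEqual, feqrel;

    // Hypothetical stand-in for a real computation.
    float someCalculation() { return 1.0f / 3.0f; }

    unittest
    {
        float computed = someCalculation();
        float expected = 0.33333334f;   // 1/3 rounded to float

        // State the bound you actually require:
        assert(approxEqual(computed, expected, 1e-6, 1e-9));

        // Or check how many significand bits agree (float has 24):
        assert(feqrel(computed, expected) >= 20);
    }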
>> Since the real error bound is always larger than that, almost
>> any error bound you pick will tend to be closer to the real
>> error bound, or at least usually bigger and therefore more
>> realistic, than checking for exact equality.
>
> I disagree. It is much better to get extremely wrong results
> frequently and therefore detect the error in testing.
>
> What you are saying is that it is better to get extremely wrong
> results infrequently, which usually leads to errors passing
> testing and entering production.
>
> In order to test well you also need to understand what input
> makes the algorithm unstable/fragile.
Nobody is talking about the general principle of how often you
get wrong results or unit testing. We were talking about a very
specific situation: how should compile-time constants be checked
and variables compared to constants, compile-time or not, to
avoid exceptional situations. My point is that both should
always be thought about. In the latter case, i.e. your f(x)
example, the issue has nothing to do with error bounds: your
f(x) is not only invalid at 2, but in a range around 2.
Now, both will lead to fewer "wrong results," but those are wrong
results you _should_ be trying to avoid as early as possible.
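A sketch of what I mean, assuming the function in question is
something like f(x) = 1/(x - 2) (the original code isn't quoted
here), with the guard covering the neighborhood rather than the
single point:

    import std.math : fabs;

    // Reject the whole neighborhood of the pole, not just the
    // exact value 2.  The 1e-4 cutoff is arbitrary for the sketch.
    float f(float x)
    {
        assert(fabs(x - 2.0f) > 1e-4f, "x too close to the pole at 2");
        return 1.0f / (x - 2.0f);
    }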
>> The computer doesn't know that, so it will just plug that x in
>> and keep cranking, till you get nonsense data out the end, if
>> you don't tell it to check that x isn't too close to 2 and not
>> just 2.
>
> Huh? I am not getting nonsense data. I am getting what I am
> asking for, I only want to avoid dividing by zero because it
> will make the given hardware 100x slower than the test.
Zero is not the only number that screws up that calculation.
>> You have a wrong mental model that the math formulas are the
>> "real world," and that the computer is mucking it up.
>
> Nothing wrong with my mental model. My mental model is the
> hardware specification + the specifics of the programming
> platform. That is the _only_ model that matters.
>
> What D prevents me from getting is the specifics of the
> programming platform by making the specifics hidden.
Your mental model determines what you think is valid input to
f(x) and what isn't; that has nothing to do with D. You want D
to provide you a way to only check for 0.0, whereas my point is
that there are many numbers in the neighborhood of 0.0 which will
screw up your calculation, so really you should be using
approxEqual.
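For instance, a sketch, assuming the 100x slow path being talked
about is denormal operands rather than the division by exact zero
itself:

    import std.math : fabs;

    // Guard the whole subnormal neighborhood of zero, not just 0.0:
    // tiny denominators are where the slow path and the huge
    // results actually live.
    float reciprocal(float x)
    {
        assert(fabs(x) >= float.min_normal, "x too close to zero");
        return 1.0f / x;
    }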
>> The truth is that the computer, with its finite maximums and
>> bounded precision, better models _the measurements we make to
>> estimate the real world_ than any math ever written.
>
> I am not estimating anything. I am synthesising artificial
> worlds. My code is the model, the world is my code running at
> specific hardware.
>
> It is self contained. I don't want the compiler to change my
> model because that will generate the wrong world. ;-)
It isn't changing your model: you can always use a very small
threshold in approxEqual. Yes, a few more values would be
disallowed as input and output than if you were to compare
exactly to 0.0, but your model is almost certainly undefined
there too.
If your point is that you're modeling artificial worlds that have
nothing to do with reality, you can always change your threshold
around 0.0 to be much smaller, and who cares if it can't go all
the way to zero, it's all artificial, right? :) If you're
modeling the real world, any function that blows up and gives you
bad data blows up over a range, never at a single point, because
that's how measurement works.
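Concretely, "a very small threshold" just means passing explicit
tolerances to approxEqual; the values below are arbitrary for the
sketch:

    import std.math : approxEqual;

    unittest
    {
        float result = 1.0f - 3.0f * (1.0f / 3.0f);  // ideally zero

        // Default tolerances (about 1e-2 relative, 1e-5 absolute):
        assert(approxEqual(result, 0.0f));

        // Or as tight as you like, short of demanding bit-exactness:
        assert(approxEqual(result, 0.0f, 1e-6, 1e-7));
    }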
>>>> Oh, it's real world alright, you should be avoiding more
>>>> than just 2 in your example above.
>>>
>>> Which number would that be?
>>
>> I told you, any numbers too close to 2.
>
> All numbers close to 2 in the same precision will work out ok.
They will give you large numbers that can be represented in the
computer, but they do not describe the real world, because
such formulas are really invalid in a neighborhood of 2, not just
at 2.
>> On the contrary, it is done because 80-bit is faster and more
>> precise, whereas your notion of reliable depends on an
>> incorrect notion that repeated bit-exact results are better.
>
> 80 bit is much slower. 80 bit mul takes 3 micro ops, 64 bit
> takes 1. Without SIMD 64 bit is at least twice as fast. With
> SIMD multiply-add is maybe 10x faster in 64bit.
I have not measured this speed myself so I can't say.
> And it is neither more precise nor more accurate when you don't
> get consistent precision.
>
> In the real world you can get very good performance for the
> desired accuracy by using unstable algorithms by adding a stage
> that compensate for the instability. That does not mean that it
> is acceptable to have differences in the bias as that can lead
> to accumulating an offset that brings the result away from zero
> (thus a loss of precision).
A lot of hand-waving about how more precision is worse, with no
real example, which is what Walter keeps asking for.
>> You noted that you don't care that the C++ spec says similar
>> things, so I don't see why you care so much about the D spec
>> now.
>> As for that scenario, nobody has suggested it.
>
> I care about what the C++ spec says. I care about how the platform
> interprets the spec. I never rely on _ONLY_ the C++ spec for
> production code.
Then you must be perfectly comfortable with a D spec that says
similar things. ;)
> You have said previously that you know the ARM platform. On
> Apple CPUs you have 3 different floating point units: 32 bit
> NEON, 64 bit NEON and 64 bit IEEE.
>
> It supports 1x64bit IEEE, 2x64bit NEON and 4x32 bit NEON.
>
> You have to know the language, the compiler and the hardware to
> make this work out.
Sure, but nobody has suggested interchanging the three randomly.
>>> And so is "float" behaving differently than "const float".
>>
>> I don't believe it does.
>
> I have proven that it does, and posted it in this thread.
I don't think that example has much to do with what we're talking
about. It appears to be some sort of constant folding in the
assert that produces the different results, as Joe says, which
goes away if you use approxEqual. If you look at the actual
const float initially, it is very much a float, contrary to your
assertions.
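For reference, the shape of that example as I understand it, a
reconstruction rather than the exact code posted earlier in the
thread:

    import std.math : approxEqual;

    void main()
    {
        const float c = 1.0f / 3.0f;  // may be folded at compile time
        float v = 1.0f;
        v /= 3.0f;                    // computed at run time

        // Whether this bit-exact comparison holds depends on how the
        // constant is folded and what precision the run-time code
        // uses:
        // assert(c == v);
        assert(approxEqual(c, v));    // holds regardless
    }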