Always false float comparisons
Joakim via Digitalmars-d
digitalmars-d at puremagic.com
Fri May 20 02:14:46 PDT 2016
On Thursday, 19 May 2016 at 18:22:48 UTC, Timon Gehr wrote:
> On 19.05.2016 08:04, Joakim wrote:
>> On Wednesday, 18 May 2016 at 17:10:25 UTC, Timon Gehr wrote:
>>> It's not just slightly worse, it can cut the number of useful
>>> bits in half or more! It is not unusual, I have actually run
>>> into those problems in the past, and it can break an algorithm
>>> that is in Phobos today!
>>
>> I wouldn't call that broken. Looking at the hex output by
>> replacing %f with %A in writefln, it appears the only
>> differences in all those results are in the last byte of the
>> significand.
>
> Argh...
>
> // ...
>
> void main(){
>     //double[] data=[1e16,1,-9e15];
>     import std.range;
>     double[] data=1e16~repeat(1.0,100000000).array~(-9e15);
>     import std.stdio;
>     writefln("%f",sum(data));         // baseline
>     writefln("%f",kahan(data));       // kahan
>     writefln("%f",kahanBroken(data)); // broken kahan
> }
>
>
> dmd -run kahanDemo.d
> 1000000000000000.000000
> 1000000100000000.000000
> 1000000000000000.000000
>
> dmd -m32 -O -run kahanDemo.d
> 1000000000000000.000000
> 1000000000000000.000000
> 1000000000000000.000000
>
>
> Better?
>
> Obviously there is more structure in the data that I invent
> manually than in a real test case where it would go wrong. The
> problems carry over though.
I looked over your code a bit. If I define sum and c as reals in
"kahanBroken" at runtime, this problem goes away. Since that is
exactly what the CTFE rule does, i.e. extend all floating-point
computation to real at compile time, I don't see what you're
complaining about. Try it: run even your original naive summation
algorithm through CTFE and it will produce the result you want:
enum double[] ctData=[1e16,1,-9e15];
enum ctSum = sum(ctData);
writefln("%f", ctSum);
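To make that concrete, here is a minimal, self-contained sketch. The
kahan/kahanBroken bodies were elided in the quoted demo, so naiveSum
and kahanReal below are illustrative names with textbook
implementations, not Timon's exact code; kahanReal is Kahan summation
with sum and c declared as real, i.e. the change described above.

import std.stdio;
import std.range : repeat;
import std.array : array;

// Plain left-to-right summation in double, used both at run time and
// through CTFE below.
double naiveSum(double[] data)
{
    double result = 0;
    foreach (x; data)
        result += x;
    return result;
}

// Textbook Kahan summation with the running sum and the compensation
// term widened to real (80-bit x87 extended on x86).
double kahanReal(double[] data)
{
    real sum = 0.0;
    real c = 0.0;             // compensation for lost low-order bits
    foreach (x; data)
    {
        real y = x - c;       // apply the compensation
        real t = sum + y;     // low-order bits of y may be lost here...
        c = (t - sum) - y;    // ...but are recovered into c
        sum = t;
    }
    return cast(double) sum;
}

void main()
{
    enum double[] ctData = [1e16, 1, -9e15];
    enum ctSum = naiveSum(ctData);       // forced through CTFE
    writefln("%f", ctSum);               // 1000000000000001.000000
    writefln("%f", naiveSum(ctData));    // run time: the 1 is lost

    // same data as Timon's demo (allocates roughly 800 MB)
    double[] data = 1e16 ~ repeat(1.0, 100_000_000).array ~ (-9e15);
    writefln("%f", kahanReal(data));     // 1000000100000000.000000
}

The expected values in the comments assume the behaviour discussed in
this thread: CTFE evaluating double arithmetic at real precision, and
the run-time build doing plain double (SSE) arithmetic.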
>> As Don's talk pointed out,
>> all floating-point calculations will see loss of precision
>> starting there.
>> ...
>
>
> This is implicitly assuming a development model where the
> programmer first writes down the computation as it would be
> correct in the real number system and then naively replaces
> every operation by the rounding equivalent and hopes for the
> best.
No, it is intrinsic to any floating-point calculation.
> It is a useful rule if that is what you're doing. One might be
> doing something else. Consider the following paper for an
> example where the last bit in the significand actually carries
> useful information for many of the values used in the program.
>
> http://www.jaist.ac.jp/~s1410018/papers/qd.pdf
Did you link to the wrong paper? ;) I skimmed it and that paper
explicitly talks about error bounds all over the place. The only
mention of "the last bit" is when they say they calculated their
constants in arbitrary precision before rounding them for runtime
use, which is ironically similar to what Walter suggested doing
for D's CTFE also.
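For what it's worth, double-double/quad-double ("qd") arithmetic of the
kind that paper deals with is built on error-free transformations such
as Knuth's TwoSum, where the rounding error of a double addition is
itself captured exactly in a second double, so the low-order bits, down
to the last one, carry real information. A minimal sketch of the
standard algorithm (not code from the paper):

import std.stdio;

// Knuth's TwoSum: s = fl(a + b) and e = the exact rounding error, so
// that a + b == s + e holds exactly, provided every operation below is
// actually rounded to double precision.
void twoSum(double a, double b, out double s, out double e)
{
    s = a + b;
    double bv = s - a;               // the part of b that made it into s
    e = (a - (s - bv)) + (b - bv);   // what was rounded away
}

void main()
{
    double s, e;
    twoSum(1e16, 1.0, s, e);
    writefln("s = %A, e = %A", s, e); // s drops the 1.0; e recovers it
}

The identity only holds if each intermediate is rounded to double; if
the compiler silently evaluates those subtractions at higher precision,
e is no longer the exact error of the double addition, which is the
kind of behaviour under debate here.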
>> In this case, not increasing precision gets the more accurate
>> result, but other examples could be constructed that _heavily_
>> favor increasing precision.
>
> Sure. In such cases, you should use higher precision. What is
> the problem? This is already supported (the compiler is not
> allowed to use lower precision than requested).
I'm not the one with the problem, you're the one complaining.
>> In fact, almost any real-world, non-toy calculation would
>> favor it.
>>
>> In any case, nobody should depend on the precision out that far
>> being accurate or "reliable."
>
>
> IEEE floating point has well-defined behaviour and there is
> absolutely nothing wrong with code that delivers more accurate
> results just because it is actually aware of the actual
> semantics of the operations being carried out.
You just made the case for Walter doing what he did. :)