Sorting floating-point values, and NaN

Tue Nov 12 09:59:36 PST 2013

On Tuesday, 12 November 2013 at 16:48:01 UTC, Andrei Alexandrescu 
wrote:
> On 11/12/13 8:00 AM, Vladimir Panteleev wrote:
>> 1) NaN values are accepted as input almost anywhere where 
>> doubles are.
>> std.conv recognizes the string "nan" and converts it to 
>> double.nan,
>
> They are still singular values. Code has no business comparing 
> them unwittingly.
>
>> 2) It is unreasonable to expect every D user out there dealing 
>> with
>> floating point numbers to be forced to check every source of 
>> floating
>> point values in their program against NaNs.
>
> I find that perfectly reasonable. How do you mean that?

Both of the above hinge on the relative obscurity of NaNs and the 
problems they could cause. People who are not specifically 
familiar with NaNs and how they interact with other 
floating-point values will treat floating-point values as "just 
numbers", and expect them to compare and sort in the same way as 
integers.

> The language allows NaN floating point number with the 
> semantics of IEEE 754. We hardly have any leeway in the matter 
> unless we want to irate a lot of people for no good reason.

> It's a judgment call, not a cut and dried decision.

Agreed - however, I think there might be more than one way to 
deal with this.

1) As above, introduce a "less" function which guarantees 
transitivity for basic types, and use it in examples throughout.

2) Improve documentation and visibility of this problem. For 
example, add this to the documentation of sort:

"The predicate must be transitive (`a<b && b<c` implies `a<c`) in 
order for `sort` to behave properly - otherwise, the program may 
fail on certain inputs (but not others) when not compiled in 
release mode. Note that `a<b` is not transitive for floating 
points, because the expression will always be `false` when either 
`a` or `b` is `NaN`."

> We should simply add an assert to isSorted.

How would this be implemented, anyway? I stumbled upon this 
problem with this code:

enum sortPred = q{a.hits > b.hits || (a.hits == b.hits && a.value 
< b.value)};
all.sort!sortPred();

(I know about multiSort, but it currently seems to be buggy - I 
do have a test case, but it has a few million entries and I need 
to reduce it.)

> if (a[i] < a[i + 1]) { assert(!(a[i + 1] < a[i])); ... }
> else { assert(a[i + 1] >= a[i]); ... }

Sort et al don't know how to check "a>=b"... and they can't use 
"a<b" either, because !(a<b) && !(b<a) implies a==b.