floats default to NaN... why?

F i L witte2008 at gmail.com
Sun Apr 15 19:38:24 PDT 2012


Jerome BENOIT wrote:
>> Just because mathematical equations behave differently with 
>> zero doesn't change the fact that zero _conceptually_ 
>> represents "nothing"
>
> You are totally wrong: here we are dealing with a key concept 
> of group theory.

Zero is the starting place for any everyday scale. Call it
what you want, but it doesn't change the fact that we *all*
understand "zero" in a _basic_ way. And that IS my point here. It
is natural for us to start with zero because that's what the
majority of us have done throughout our entire lives. I respect
the fact that you're a Theoretical Physicist and that your
experience with math must be very different from mine. I'm also
convinced, because of that fact, that you're closer to a corner
case among programmers than to the majority.


>> It's the default for practical reasons. Not for mathematics' 
>> sake, but for the sake of convenience. We don't all study 
>> higher mathematics, but we're all taught to count from the 
>> time we're toddlers. Zero makes sense as the default, and 
>> that's reinforced by the fact that int *must* be zero.
>>
>
> The convenience here is about numerical practice, not coding 
> practice; that is the point:
> for numerical folks, zero is a very bad choice; NaN is a very 
> good one.

I disagree. Coding is much broader than using code to write
mathematical equations, and the default should reflect that. And
even when writing equations, explicitly initializing variables to
NaN as a debugging practice makes more sense than removing the
convenience of having a usable default in the rest of your code.


>>> 0 / 0 = NaN // undefined
>>
>> Great! Yet another reason to default to zero. That way, "0 / 
>> 0" bugs have a very distinct fingerprint.
>
> While the others (which are by far more likely) are bypassed; 
> here you are making a point against yourself:
>
> NaN + x = NaN
> NaN * x = NaN
> x / NaN = NaN
> NaN / x = NaN

This was intended more as tongue-in-cheek than as an actual
argument. But your response doesn't address what I was getting at
either. That is: debugging incorrectly-set values is actually
made more complicated by having NaN propagate in only some
areas. I gave code examples before showing how a variable can
just as easily be incorrectly set directly after initialization,
and will therefore leave a completely different fingerprint for a
virtually identical issue.
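
To restate the kind of example I mean, here's a minimal D sketch
(the variable names are made up):

    import std.math : isNaN;

    void main()
    {
        double missed;            // forgot to set it: D gives NaN
        double wrong = 15.0;      // set it, but to the wrong value

        double a = missed * 2.0;  // NaN propagates: loud fingerprint
        double b = wrong * 2.0;   // 30.0 looks plausible: silent bug

        assert(isNaN(a));         // caught by the default
        assert(b == 30.0);        // not caught: a virtually identical
    }                             // mistake, different fingerprint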


>>> , which is in line with how pointers behave (only applicable 
>>> to memory, not scale).
>>>
>>> Pointer values are also bounded.
>>
>> I don't see how that's relevant.
>
> Because then zero is a meaningful default for pointers.

I'm not trying to be difficult here, but I still don't see what
you're getting at. Pointers are _used_ differently than values,
so a more meaningful default is expected.


>>> Considering the NaN blow-up behaviour, for numerical folks 
>>> the expected behaviour is certainly setting NaN as the 
>>> default for reals.
>>> Real numbers are not meant here for coders, but for numerical 
>>> folks:
>>
>> Of course FP numbers are meant for coders... they're in a 
>> programming language. They are used by coders, and not every 
>> coder that uses FP math *has* to be well trained in the finer 
>> points of mathematics simply to use a number that can 
>> represent fractions in a conceptually practical way.
>>
> The above is not finer points, but basic ones.
> Otherwise, float and double are rather integers than by 
> fractions.

I don't understand what you wrote. Typo?

NaN as a default is purely a debugging feature. It's designed so
that you don't miss setting a [floating point] variable (though
you can still set it incorrectly). My entire argument so far has
been about the expected behavior of default values: usable values
vs. debugging features.
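
In D terms, that feature looks something like this (a minimal
sketch; "average" is just a made-up example):

    import std.math : isNaN;

    double average(double[] xs)
    {
        double sum;          // oops: relies on the default
        foreach (x; xs)
            sum += x;        // NaN + x == NaN, every iteration
        return sum / xs.length;
    }

    void main()
    {
        // the bug jumps out, but only because the default was
        // NaN rather than 0.0
        assert(isNaN(average([1.0, 2.0, 3.0])));
    }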


>>> D applies here a rule gained from the experience of 
>>> numerical people.
>>
>> I'm sorry I can't hear you over the sound of how popular Java 
>> and C# are.
>
> Sorry, I can't hear you over the sound of mathematics.

That doesn't make any sense... I brought up Java and C# because
they're both immensely popular, modern languages with
zero-defaulting FP variables. If D's goal is to become more
mainstream, it could learn from their successful features, which
are largely built around convenience.

We already have great unittest debugging features in D (much
better than C#'s); we don't need D forcing us to debug in
impractical areas.
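
Just to illustrate (a trivial sketch, run with dmd -unittest;
the function is made up):

    double fahrenheit(double celsius)
    {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    unittest
    {
        // wrong values are caught here, at the point of use,
        // instead of waiting for a NaN to surface downstream
        assert(fahrenheit(0.0) == 32.0);
        assert(fahrenheit(100.0) == 212.0);
    }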


>  Convenience is about productivity, and that's largely 
> influenced by how much prior knowledge someone needs before 
> being able to understand a feature's behavior.
>
> Floating point calculus basics are easy to understand.

Sure. Not having to remember them at all (for FP variables only)
is even easier.


>>> For numerical work, because 0 behaves nicely most of the 
>>> time, improperly initialized variables may go undetected 
>>> because the output data can look reasonable;
>>> on the other hand, because NaN blows up, such detection is 
>>> straightforward: the output will be NaN, which will jump 
>>> out at you very quickly.
>>
>> I gave examples which address this. This behavior is only 
>> [debatably] beneficial in corner cases on FP numbers 
>> specifically. I don't think that's sufficient justification in 
>> light of the reasons I gave above.
>
> This is more than sufficient, because the authority for 
> floating point (aka numerical) stuff is held by numerical folks.

Again, it's about debugging vs. convenience. The "authority"
should be what the majority _expect_ a variable's default to be.
Given that variables are created to be used, that int defaults to
zero, and that zero is part of *everyone's* daily life
(conceptually), I think usable values (and zero) make more
sense.

Default to NaN explicitly, as a debugging technique, when you're
writing mathematically sensitive algorithms.
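
Something like this (a minimal sketch):

    import std.math : isNaN;

    void main()
    {
        double a = 0.0, b = 2.0, c = 1.0;

        // opt in to the NaN fingerprint where it actually pays off:
        double discriminant = double.nan;  // "not yet computed"
        if (a != 0)
            discriminant = b * b - 4 * a * c;

        // the forgotten a == 0 path now has a loud signature
        assert(isNaN(discriminant));
    }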


>>> This is a numerical issue, not a coding language issue.
>>
>> No, it's both.
>
> So a choice has to be made: the mature choice is the NaN 
> approach.

The convenient choice is defaulting to usable values. The logical
choice for the default is zero. NaN is for debugging, which
should be explicitly defined.


>  We're not Theoretical physicists
>
> I am

That commands a certain amount of respect from me, but it also
strengthens my belief that your perspective on this issue is
skewed. D should be as easy to use and understand as possible
without sacrificing efficiency.


>  we're Software Engineers writing a very broad scope of 
> different programs.
>
> Does floating point calculation belong to the broad scope?

Yes. You only need an elementary understanding of math to use a
fraction.


> Do engineers rely on numerical mathematicians' skills when they 
> code numerical stuff, or on pre-calculus books for grocers?

Depends on what they're writing. Again, it's not a mathematical
issue, but a debugging vs convenience one.


>>> Personally, in my C code I have gotten into the habit of 
>>> initialising real numbers (doubles) with NaN:
>>> in the GSL library there is a ready-to-use macro: GSL_NAN. 
>>> (Concerning integers, I use extreme values such as INT_MIN, 
>>> INT_MAX, SIZE_MAX, ...).
>>
>> Only useful because C defaults to garbage.
>
> It can be initialized to 0.0 as well.

My point was that in C you're virtually forced to explicitly
initialize your values because they're unreliable otherwise. D
doesn't suffer from this, and could benefit from a more usable
default.
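
A minimal illustration of the difference (a sketch; in C the
value of an uninitialized local is simply indeterminate):

    import std.math : isNaN;

    void main()
    {
        double d;          // garbage in C; reliably NaN in D
        assert(isNaN(d));

        int i;             // integers get a usable default...
        assert(i == 0);    // ...so nothing forces you to write = 0
    }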

