Which D features to emphasize for academic review article

Sat Aug 11 01:57:51 PDT 2012

On Friday, 10 August 2012 at 22:01:46 UTC, Walter Bright wrote:
> It catches only a subset of these at compile time. I can craft 
> any number of ways of getting it to miss diagnosing it. 
> Consider this one:
>
>     float z;
>     if (condition1)
>          z = 5;
>     ... lotsa code ...
>     if (condition2)
>          z++;
>
> To diagnose this correctly, the static analyzer would have to 
> determine that condition1 produces the same result as 
> condition2, or not. This is impossible to prove. So the static 
> analyzer either gives up and lets it pass, or issues an 
> incorrect diagnostic. So our intrepid programmer is forced to 
> write:
>
>     float z = 0;
>     if (condition1)
>          z = 5;
>     ... lotsa code ...
>     if (condition2)
>          z++;
>
> Now, as it may turn out, for your algorithm the value "0" is an 
> out-of-range, incorrect value. Not a problem as it is a dead 
> assignment, right?
>
> But then the maintenance programmer comes along and changes 
> condition1 so it is not always the same as condition2, and now 
> the z++ sees the invalid "0" value sometimes, and a silent bug 
> is introduced.
>
> This bug will not remain undetected with the default NaN 
> initialization.

The compiler in languages like C# doesn't try to prove that the 
variable is NOT set and then emit an error. It tries to prove 
that the variable IS set, and if it can't prove that, it's an 
error.

It's not an incorrect diagnostic; it does exactly what it's 
supposed to do, and the programmer has to be explicit when taking 
on the responsibility of initialization. I don't see anybody 
complaining about this feature in C#; most experienced C# 
programmers I've talked to love it (I much prefer it too).

Leaving a local variable initially uninitialized (or rather, not 
explicitly initialized) is a good way to portray the intention 
that it's going to be conditionally initialized later. In C#, if 
your program compiles, your variable is guaranteed to be 
initialized later but before use. This is a useful guarantee when 
reading/maintaining code.
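
As a rough sketch of what that rule accepts, here is the pattern 
written in D (the function name pick is made up for illustration). 
D compiles this either way; the point is that a C#-style 
definite-assignment check would also accept the equivalent code, 
because every path assigns z before it is read:

```d
// Hypothetical helper, for illustration only.
float pick(bool condition)
{
    float z;            // no explicit initializer: intent is "set below"
    if (condition)
        z = 5;
    else
        z = 10;
    return z + 1;       // z was assigned on every path leading here
}

void main()
{
    assert(pick(true) == 6);
    assert(pick(false) == 11);
}
```

Drop the else branch and the C# rule rejects the read of z, while 
D accepts it and z + 1 evaluates to NaN.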

In D, on the other hand, it's possible to write D code like:

for(size_t i; i < length; ++i)
{
     ...
}

And I've actually seen this kind of code a lot in the wild. It 
boggles my mind that you think that this code should be legal. I 
think it's lazy - the intention is not clear. Is the default 
initializer being intentionally relied on, or was it 
unintentional? I've seen both cases. The for-loop example is an 
extreme one for demonstrative purposes; most examples are less 
obvious.
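
To spell out the ambiguity, here is a minimal sketch (both function 
names are made up for illustration). The two versions behave 
identically because size_t.init is 0, but only the second one tells 
the reader that starting from 0 was a decision:

```d
// Hypothetical functions, for illustration only.
size_t countImplicit(size_t length)
{
    size_t n;                           // relies on size_t.init == 0
    for (size_t i; i < length; ++i)     // so does the loop counter
        ++n;
    return n;
}

size_t countExplicit(size_t length)
{
    size_t n = 0;                       // intent stated outright
    for (size_t i = 0; i < length; ++i)
        ++n;
    return n;
}

void main()
{
    assert(countImplicit(3) == countExplicit(3));
}
```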

Saying that most programmers will explicitly initialize floating 
point numbers to 0 instead of NaN when taking on initialization 
responsibility is a cop-out - float.init and float.nan are 
obviously the values you should be going for. The benefit is easy 
for programmers to understand, especially if they already 
understand why float.init is NaN. You say yelling at them 
probably won't help - why not? I personally use 
float.init/double.init etc. in my own code, and I'm sure other 
informed programmers do too. I can understand why people don't do 
it in, say, C, with NaN being less defined there afaik. D 
promotes NaN actively and programmers should be eager to leverage 
NaN explicitly too.
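
Here is a sketch of the quoted example with the explicit 
float.init that I'm arguing for. The function name compute and the 
two bool parameters are stand-ins for condition1 and condition2, 
not anything from real code:

```d
import std.math : isNaN;

// Hypothetical function; the flags stand in for condition1/condition2.
float compute(bool condition1, bool condition2)
{
    float z = float.init;   // explicit: "not yet a valid value"
    if (condition1)
        z = 5;
    if (condition2)
        z++;                // NaN stays NaN if z was never assigned
    return z;
}

void main()
{
    assert(compute(true, true) == 6);
    assert(isNaN(compute(false, true)));  // the bug is visible, not silent
}
```

The programmer has taken on initialization explicitly, and the 
maintenance scenario from the quote still surfaces as a NaN 
instead of an innocent-looking number.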

It's also important to note that C# works the same as D for 
non-local variables - they all have a defined default initializer 
(the C# equivalent of T.init is default(T)). Another point is 
that the local-variable analysis is limited to the scope of a 
single function body; it does not do inter-procedural analysis.
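
For reference, this is what the default initializers look like on 
the D side for non-local variables. It's only a small sketch; the 
struct Sample and the variable global are made-up names:

```d
import std.math : isNaN;

// Hypothetical type, for illustration only.
struct Sample
{
    int   count;   // int.init is 0
    float ratio;   // float.init is NaN
    char  tag;     // char.init is 0xFF
}

Sample global;     // module-level variable, also default-initialized

void main()
{
    assert(global.count == int.init && global.count == 0);
    assert(isNaN(global.ratio));
    assert(global.tag == char.init && global.tag == 0xFF);
}
```

None of this would change under the proposal; only locals inside a 
function body would need to be provably assigned before use.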

I think this would be a great thing for D, and I believe that all 
code this change breaks is actually broken to begin with.