safety model in D

Thu Nov 5 06:48:46 PST 2009

Don wrote:
> Rainer Deyke wrote:
>> Andrei Alexandrescu wrote:
>>> I hear what you're saying, but I am not enthusiastic at all about
>>> defining and advertising a half-pregnant state. Such a language is the
>>> worst of all worlds - it's frustrating to code in yet gives no guarantee
>>> to anyone. I don't see this going anywhere interesting. "Yeah, we have
>>> safety, and we also have, you know, half safety - it's like only a lap
>>> belt of sorts: inconvenient like crap and doesn't really help in an
>>> accident." I wouldn't want to code in such a language.
>>
>> Basically you're saying that safety is an all or nothing deal.  Not only
>> is this in direct contradiction to the attempts to allow both safe and
>> unsafe modules to coexist in the same program, it is in contradiction
>> with all existing programming languages, every single one of which
>> offers some safety features but not absolute 100% safety.
>>
>> If you have a formal definition of safety, please post it.  Without such
>> a definition, I will use my own informal definition of safety for the
>> rest of this post: "a safety feature is a language feature that reduces
>> programming errors."
>>
>> First, to demonstrate that all programming languages in existence offer
>> some safety features.  With some esoteric exceptions (whitespace, hq9+),
>> all programming languages have a syntax with some level of redundancy.
>> This allows the language implementation to reject some inputs as
>> syntactically incorrect.  A redundant syntax is a safety feature.
>>
>> Another example relevant to D: D requires an explicit cast when
>> converting an integer to a pointer.  This is another safety feature.
>>
>> Now to demonstrate that no language offers 100% safety.  In the
>> abstract, no language can guarantee that a program matches the
>> programmer's intention.  However, let's look at a more specific form of
>> safety: safety from dereferencing dangling pointers.  To guarantee this,
>> you would need to guarantee that the compiler never generates faulty
>> code that causes a dangling pointer to be dereferenced.  If the
>> program makes any system calls at all, you would also need to guarantee
>> that no bugs in the OS cause a dangling pointer to be dereferenced.
>> Both of these are clearly impossible.  No language can offer 100% safety.
>>
>> Moreover, that safety necessarily reduces convenience is clearly false.
>>  This /only/ applies to compile-time checks.  Runtime checks are purely
>> an implementation issue.  Even C and assembly can be implemented such
>> that all instances of undefined behavior are trapped at runtime.
>>
>> Conversely, the performance penalty of safety applies mostly to runtime
>> checks.  If extensive testing with these checks turned on fails to
>> reveal any bugs, it is entirely reasonable to remove these checks for
>> the final release.
> 
> I'm in complete agreement with you, Rainer.
> What I got from Bartosz' original post was that a large class of bugs 
> could be eliminated fairly painlessly via some compile-time checks. It 
> seemed to be based on pragmatic concerns. I applauded it. (I may have 
> misread it, of course).
> Now, things seem to have left pragmatism and got into ideology. Trying 
> to eradicate _all_ possible memory corruption bugs is extremely 
> difficult in a language like D. I'm not at all convinced that it is 
> realistic (ends up too painful to use). It'd be far more reasonable if 
> we had non-nullable pointers, for example.
> 
> The ideology really scares me, because 'memory safety' covers just one 
> class of bug. What everyone wants is to drive the _total_ bug count 
> down, and we can improve that dramatically with basic compile-time 
> checks. But demanding 100% memory safety has a horrible cost-benefit 
> tradeoff. It seems like a major undertaking.
> 
> And I doubt it would convince anyone, anyway. To really guarantee memory 
> safety, you need a bug-free compiler...

I protest against using "ideology" when characterizing safety. It 
instantly lowers the level of the discussion. There is no ideology being 
pushed here, just a clear notion with equally clear benefits. I think it 
is a good time for all of us to get a bit better informed.

First off: _all_ languages except C, C++, and assembler are, or at least 
claim to be, safe. All. I mean ALL. Did I mention all? If safety were some 
ideology that is unrealistic, extremely difficult to achieve, and too 
painful to use, that would be hard to reconcile with "ALL". Walter and I 
are in agreement that safety is not 
difficult to achieve in D and that it would allow a great many good 
programs to be written.

Second, there are not many definitions of what safe means and no ifs and 
buts about it. This whole wishy-washy notion of wanting just a little 
bit of pregnancy is just not worth pursuing. The definition is given in 
Pierce's book "Types and Programming Languages" but I was happy 
yesterday to find a free online book section by Luca Cardelli:

http://www.eecs.umich.edu/~bchandra/courses/papers/Cardelli_Types.pdf

The text is very approachable and informative, and I suggest that anyone 
interested read through page 5 at least. I think it's a must for 
anyone participating in this to read the whole thing. Cardelli 
distinguishes between programs with "trapped errors" versus programs 
with "untrapped errors". Yesterday Walter and I had a long 
discussion, followed by an email communication between Cardelli and 
myself, which confirmed that these three notions are equivalent:

a) "memory safety" (notion we used so far)
b) "no undefined behavior" (C++ definition, suggested by Walter)
c) "no untrapped errors" (suggested by Cardelli)
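
To make the trapped/untrapped distinction concrete, here is a minimal 
C++ sketch. The function names checked_read and unchecked_read are made 
up for illustration; they are not taken from Cardelli's text or from any 
D proposal. It relies only on the standard guarantee that 
std::vector::at throws std::out_of_range on a bad index, whereas 
operator[] with a bad index is undefined behavior.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Trapped error: at() checks the index and throws std::out_of_range,
// so the failure is detected at the point where it happens.
// Returns true and stores the element in `out` on success,
// returns false (leaving `out` untouched) if the index was out of range.
bool checked_read(const std::vector<int>& v, std::size_t i, int& out)
{
    try {
        out = v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;  // the error was trapped and reported to the caller
    }
}

// Untrapped error: operator[] performs no check, so an out-of-range
// index is undefined behavior. Shown for contrast only; calling it
// with a bad index is exactly what a safe subset has to rule out.
int unchecked_read(const std::vector<int>& v, std::size_t i)
{
    return v[i];
}
```

On a three-element vector, checked_read(v, 5, out) returns false and the 
program carries on in a defined state. unchecked_read(v, 5) compiles 
just the same, but nothing can be said about what it does.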

I suspect "memory safety" is the weakest marketing term of the three. 
For example, there's this complaint above: "'memory safety' covers just 
one class of bug." But when you think of programs with undefined 
behavior vs. programs with entirely defined behavior, you realize what 
an important class of bugs that is. Non-nullable pointers are mightily 
useful, but "no undefined behavior" is quite a bit better to have.

The argument about memory safety requiring a bug-free compiler is 
correct. It was actually aired quite a bit in Java's first years. It can 
be confidently said that Java won that argument. Why? Because Java had a 
principled approach that slowly but surely sealed all the gaps. The fact 
that dmd has bugs now should be absolutely no excuse for us to give up 
on defining a safe subset of the language.

Andrei