Contradictory justification for status quo

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Fri Feb 27 13:07:28 PST 2015


On Fri, Feb 27, 2015 at 07:57:22AM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> On 2/27/15 7:33 AM, H. S. Teoh via Digitalmars-d wrote:
> >On Fri, Feb 27, 2015 at 06:02:57AM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> >[...]
> >>Safety is good to have, and the simple litmus test is if you slap
> >>@safe: at the top of all modules and you use no @trusted (or of course
> >>use it correctly), you should have memory safety, guaranteed.
> >[...]
> >
> >@safe has some pretty nasty holes right now... like:
> >
> >	https://issues.dlang.org/show_bug.cgi?id=5270
> >	https://issues.dlang.org/show_bug.cgi?id=8838
> >	https://issues.dlang.org/show_bug.cgi?id=12822
> >	https://issues.dlang.org/show_bug.cgi?id=13442
> >	https://issues.dlang.org/show_bug.cgi?id=13534
> >	https://issues.dlang.org/show_bug.cgi?id=13536
> >	https://issues.dlang.org/show_bug.cgi?id=13537
> >	https://issues.dlang.org/show_bug.cgi?id=14136
> >	https://issues.dlang.org/show_bug.cgi?id=14138
> >
> >There are probably other holes that we haven't discovered yet.
> 
> Yah, @safe is in need of some good TLC. How about we make it a
> priority for 2.068?

If we're going to do that, let's do it right. Let's outlaw everything in
@safe and then start expanding it by adding explicitly-vetted
operations. See below.


> >All in all, it's not looking like much of a guarantee right now.
> >It's more like a cheese grater.
> >
> >This is a symptom of the fact that @safe, as currently implemented,
> >starts by assuming the whole language is @safe, and then checking for
> >exceptions that are deemed unsafe. Since D has become quite a large,
> >complex language, many unsafe operations and unsafe combinations of
> >features are bound to be overlooked (cf. combinatorial explosion),
> >hence there are a lot of known holes and probably just as many, if
> >not more, unknown ones.
> 
> I'd have difficulty agreeing with this. The issues you quoted don't
> seem to follow a pattern of combinatorial explosion.

No, what I meant was that in an "assume safe unless proven otherwise"
system, there are bound to be holes, because the combinatorial explosion
of feature combinations makes it almost certain that there's *some*
unsafe combination we haven't thought of yet but that the compiler
currently accepts. And it may be a long time before we discover such a
flaw.

This means that the current implementation almost certainly has holes
(in fact it has quite a few known ones, and very likely more as-yet
unknown ones), so it's not much of a "guarantee" of safety at all.
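
To pick just one concrete example of the kind of hole I mean (in the
same vein as some of the issues listed above; the names here are only
for illustration), code of roughly this shape has been accepted under
@safe even though it escapes a reference to stack memory:

    @safe int[] leak()
    {
        int[8] buf;              // stack-allocated static array
        int[] slice = buf[];     // slice of the stack memory
        return slice;            // escapes the stack frame
    }

    @safe void oops()
    {
        int[] dangling = leak(); // now points into a dead stack frame
        dangling[0] = 1;         // memory corruption, in "@safe" code
    }

Nobody ever decided this was safe; the compiler simply never got around
to deciding that it was unsafe.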

What I'm proposing is that we reverse that: start with prohibiting
everything, which is by definition safe, since doing nothing is
guaranteed to be safe. Then slowly add to it the things that are deemed
safe after careful review, until it becomes a usable subset of the
language. This way, we actually *have* the guarantee of safety from day
one, and all we have to do is to make sure each new addition to the list
of permitted operations doesn't introduce any new holes. And even in the
event that it does, the damage is confined because we know exactly where
the problem came from: we know that some number of commits ago @safe had
no holes and now it does, so git bisect will quickly locate the
offending change.

Whereas in our current approach, everything is permitted by default,
which means the safety guarantee is broken *by default*, except where
we noticed a hole and plugged it. We're starting with a cheese grater
and plugging
the holes one by one, hoping that one day it will become a solid plate.
Why not start with a solid plate in the first place, and make sure we
don't accidentally punch holes through it?


> On another vein, consider that the Java Virtual Machine has had for
> many, many years bugs in its safety, even though it was touted to be
> safe from day one. With each of the major bugs, naysayers claimed it's
> unfixable and it belies the claim of memory safety.

Fallacy: Language X did it this way, therefore it's correct to do it
this way.


> A @safe function may assume that the code surrounding it has not
> broken memory integrity. Under that assumption, it is required (and
> automatically checked) that it leaves the system with memory
> integrity. This looks like a reasonable stance to me, and something
> I'm committed to work with.

That's beside the point. Whether the surrounding context is safe or not
has no bearing on whether certain combinations of operations inside the
@safe function have unsafe semantics that the compiler failed to
recognize as such. The latter is what I'm talking about.
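
To illustrate that distinction concretely: in a sketch like the one
below (hypothetical function names, but in the spirit of the ref-escape
issues listed earlier), each function looks innocuous on its own, and
the safety of the surrounding context is irrelevant. It's the
*combination* of ref parameters with ref returns that lets a reference
to a dead stack frame escape, and that combination is exactly the kind
of construct that slips through when the compiler only checks for the
unsafe cases it already knows about:

    @safe ref int fwd(ref int x) { return x; }

    @safe ref int escape()
    {
        int local = 42;
        return fwd(local);   // reference into escape()'s expired frame
    }

Each function passes the @safe checks individually; the hole only
appears when they are put together.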


> >Trying to fix them is like playing whack-a-mole: there's always yet
> >one more loophole that we overlooked, and that one hole compromises
> >the whole system. Not to mention, every time a new language feature
> >is added, @safe is potentially compromised by newly introduced
> >combinations of features that are permitted by default.
> 
> There aren't many large features to be added, and at this point with
> @safe being a major priority I just find it difficult to understand
> this pessimism.

It's not about the size of a new feature.  Every new feature, even a
seemingly small one, causes an exponential growth in the number of ways
language features can be combined, thereby increasing the surface area
for feature combinations to interact in unexpected ways. Surely you must
know this, since this is why we generally try not to add new language
features if they don't pull their own weight.

The problem with the current approach is that when compiling @safe
code, the compiler is not checking against a list of permitted features,
but against a list of prohibited ones. So by default, a new feature X
(along with all combinations of it with existing language features) is
permitted, unless somebody takes the pains to evaluate its safety in
every possible context in which it might be used, *and* to check for all
those cases when compiling in @safe mode. Given the size of the
language, something is
bound to be missed. So the safety guarantee may have been silently
broken, but we're none the wiser until some unfortunate user stumbles
upon it and takes the time to file a bug. Until then, @safe is broken
but we don't even know about it.

If, OTOH, the compiler checks against a list of permitted features
instead, feature X is rejected in @safe code by default, and we can
slowly expand the scope of X within @safe code by adding specific
instances of it to the list of permitted features as we find them. If we
miss any case, there's no problem -- it gets (wrongly) rejected at
compile time, but nothing will slip through that might break @safe
guarantees. We just get a rejects-valid bug report, add that use case to
the permitted list, and close the bug. Safety is not broken throughout
the process.
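
To spell out the difference between the two policies (this is only an
illustrative sketch, not how the compiler's safety checks are actually
structured):

    // Illustrative sketch only -- not the compiler's actual code.
    alias Check = bool function(string construct);

    // Current policy: accept unless some known-unsafe rule fires.
    // Anything nobody thought to write a rule for is silently accepted.
    bool acceptByDefault(string construct, Check[] knownUnsafe)
    {
        foreach (rule; knownUnsafe)
            if (rule(construct))
                return false;
        return true;
    }

    // Proposed policy: reject unless some vetted-safe rule fires.
    // Anything nobody has vetted yet is rejected -- a rejects-valid
    // bug at worst, never a safety hole.
    bool rejectByDefault(string construct, Check[] vettedSafe)
    {
        foreach (rule; vettedSafe)
            if (rule(construct))
                return true;
        return false;
    }

The two look symmetrical, but they fail in opposite directions: the
first fails open, the second fails closed.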


[...]
> >Rather, what *should* have been done is to start with @safe
> >*rejecting* everything in the language, and then gradually relaxed to
> >permit more operations as they are vetted to be safe on a
> >case-by-case basis.
> 
> Yah, time travel is always so enticing. What I try to do is avoid
> telling people sentences that start with "You/We should have". They're
> not productive. Instead I want to focus on what we should do starting
> now.
> 
> >See: https://issues.dlang.org/show_bug.cgi?id=12941
> 
> I'm unclear how this is actionable.
[...]

What about this, if we're serious about @safe actually *guaranteeing*
anything: after 2.067 is released, we reimplement @safe by making it
reject every language construct by default. (This will, of course, cause
all @safe code to no longer compile.) Then we slowly add back individual
language features to the list of permitted operations in @safe code
until existing @safe code successfully compiles. That gives us a
reliable starting point where we *know* that @safe is actually, y'know,
safe.

Of course, many legal things will now be (wrongly) rejected in @safe
code, but that's OK, because we will add them to the list of things
permitted in @safe code as we find them. Meanwhile, @safe actually
*guarantees* safety.  As opposed to the current situation, where @safe
only sorta-kinda gives you memory safety: provided you don't use
unanticipated combinations of features that the compiler failed to
recognize as unsafe, or use new features that weren't thoroughly checked
beforehand, or do something blatantly stupid, or do something known to
trigger a compiler bug, or ... -- then maybe, fingers crossed, you will
have memory safety. Or so we hope.


T

-- 
Question authority. Don't ask why, just do it.

