assert semantic change proposal

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Tue Aug 5 14:15:33 PDT 2014


On Tue, Aug 05, 2014 at 08:11:16PM +0000, via Digitalmars-d wrote:
> On Tuesday, 5 August 2014 at 18:57:40 UTC, H. S. Teoh via Digitalmars-d
> wrote:
> >Exactly. I think part of the problem is that people have been using
> >assert with the wrong meaning. In my mind, 'assert(x)' doesn't mean
> >"abort if x is false in debug mode but silently ignore in release
> >mode", as some people apparently think it means. To me, it means "at
> >this point in the program, x is true".  It's that simple.
> 
> A language construct with such a meaning is useless as a safety
> feature.

I don't see it as a safety feature at all.


> If I first have to prove that the condition is true before I can
> safely use an assert, I don't need the assert anymore, because I've
> already proved it.

I see it as future proofing: I may have proven the condition for *this*
version of the program, but all software will change (unless it's dead),
and change means the original proof may no longer be valid, but this
part of the code is still written under the assumption that the
condition holds. In most cases, it *does* still hold, so in general
you're OK, but sometimes a change invalidates an axiom that, in
consequence, invalidates the assertion.  Then the assertion will trip
(in non-release mode, of course), telling me that my program logic has
become invalid due to the change I made.  So I'll have to fix the
problem so that the condition holds again.


> If it is intended to be an optimization hint, it should be implemented
> as a pragma, not as a prominent feature meant to be widely used. (But
> I see that you have a different use case, see my comment below.)

And here is the beauty of the idea: rather than polluting my code with
optimization hints, which are out-of-band (and which are generally
unverified and may be outright wrong after the code undergoes several
revisions), I am stating *facts* about my program logic that must hold
-- which therefore fits in very logically with the code itself. It even
self-documents the code, to some extent. Then as an added benefit, the
compiler is able to use these facts to emit more efficient code. So to
me, it *should* be a prominent, widely-used feature. I would use it, and
use it a *lot*.

 
> >The optimizer only guarantees (in theory) consistent program
> >behaviour if the program is valid to begin with. If the program is
> >invalid, all bets are off as to what its "optimized" version does.
> 
> There is a difference between invalid and undefined: A program is
> invalid ("buggy"), if it doesn't do what it's programmer intended,
> while "undefined" is a matter of the language specification. The
> (wrong) behaviour of an invalid program need not be undefined, and
> often isn't in practice.

To me, this distinction doesn't matter in practice, because in practice,
an invalid program produces a wrong result, and a program with undefined
behaviour also produces a wrong result. I don't care what kind of wrong
result it is; what I care is to fix the program to *not* produce a wrong
result.


> An optimizer may only transform code in a way that keeps the resulting
> code semantically equivalent. This means that if the original
> "unoptimized" program is well-defined, the optimized one will be too.

That's a nice property to have, but again, if my program produces a
wrong result, then my program produces a wrong result. As a language
user, I don't care that the optimizer may change one wrong result to a
different wrong result.  What I care about is to fix the code so that
the program produces the *correct* result. To me, it only matters that
the optimizer does the Right Thing when the program is correct to begin
with. If the program was wrong, then it doesn't matter if the optimizer
makes it a different kind of wrong; the program should be fixed so that
it stops being wrong.


> >Yes, the people using assert as a kind of "check in debug mode but
> >ignore in release mode" should really be using something else
> >instead, since that's not what assert means. I'm honestly astounded
> >that people would actually use assert as some kind of
> >non-release-mode-check instead of the statement of truth that it was
> >meant to be.
> 
> Well, when this "something else" is introduced, it will need to
> replace almost every existing instance of "assert", as the latter must
> only be used if it is proven that the condition is always true. To
> name just one example, it cannot be used in range `front` and
> `popFront` methods to assert that the range is not empty, unless there
> is an additional non-assert check directly before it.

I don't follow this reasoning. For .front and .popFront to assert that
the range is non-empty, simply means that user code that attempts to do
otherwise is wrong by definition, and must be fixed. I don't care if
it's wrong as in invalid, or wrong as in undefined, the bottom line is
that code that calls .front or .popFront on an empty range is
incorrectly written, and therefore must be fixed.

If I were to implement ranges in assembly language, for example, I would
not write .front or .popFront to check for emptiness before performing
operations, because that introduces needless overhead in correct
programs. In incorrect programs, it will crash (horribly), but that's
just the consequence of the program being incorrect. Adding a check for
emptiness does not help fix incorrectness -- you will still get a wrong
result. The only difference is an exception instead of a crash -- nicer,
I'll concede, but either way, you're not getting the correct result.

Now, during development, I might add emptiness checks to my assembly
language range program for the purpose of locating incorrectly-written
parts of the program. But before I ship the product, I'd better make
darned sure that I've eliminated every bug that I can find, and
convinced myself that the program will operate correctly. If I can't
even convince myself that the program is correct, I have no business
shipping it to customers. But if I've convinced myself that it is
correct, then I might as well disable the emptiness checks so that my
product will deliver top performance -- since that wouldn't be a problem
in a correct program.


> >Furthermore, I think Walter's idea to use asserts as a source of
> >optimizer hints is a very powerful concept that may turn out to be a
> >revolutionary feature in D. It could very well develop into the
> >answer to my long search for a way of declaring identities in
> >user-defined types that allow high-level optimizations by the
> >optimizer, thus allowing user-defined types to be on par with
> >built-in types in optimizability. Currently, the compiler is able to
> >optimize x+x+x+x into 4*x if x is an int, for example, but it can't
> >if x is a user-defined type (e.g. BigInt), because it can't know if
> >opBinary was defined in a way that obeys this identity. But if we can
> >assert that this holds for the user-defined type, e.g., BigInt, then
> >the compiler can make use of that axiom to perform such an
> >optimization.  This would then allow code to be written in more
> >human-readable forms, and still maintain optimal performance, even
> >where user-defined types are involved.
> 
> This is a very powerful feature indeed, but to be used safely, the
> compiler needs to be able to detect invalid uses reliably at compile
> time.
>
> This is currently not the case:
> 
>     void onlyOddNumbersPlease(int n) {
>         assert(n % 2);
>     }
> 
>     void foo() {
>         onlyOddNumbersPlease(42);    // shouldn't compile, but does
>     }

In theory, the optimizer could use CTFE to reduce the function call, and
thereby discover that the code is invalid. We don't have that today, but
conceivably, we can achieve that one day.

But taking a step back, there's only so much the compiler can do at
compile-time. You can't stop *every* unsafe usage of something without
also making it useless. While the manufacturer of a sharp cutting tool
will presumably do their best to ensure the tool is safe to use, it's
impossible to prevent *every* possible unsafe usage of said tool. If the
user points the chainsaw to his foot, he will lose his foot, and there's
nothing the manufacturer can do to prevent this except shipping a
non-functional chainsaw. If the user insists on asserting things that
are untrue, there will always be a way to bypass the compiler's static
checks and end up with undefined behaviour at runtime.


> It would be great if this were possible. In the example of `front` and
> `popFront`, programs that call these methods on a range that could
> theoretically be empty wouldn't compile. This might be useful for
> optimization, but above that it's useful for verifying correctness.

A sufficiently-aggressive optimizer might be able to verify this at
compile-time by static analysis. But even that has its limits... for
example:

	MyRange range;
	assert(range.empty);
	if (solveRiemannHypothesis()) // <-- don't know if this is true
		range.addElements(...);

	range.popFront(); // <-- should this compile or not?


> Unfortunately this is not what has been suggested (and was evidently
> intended from the beginning)...

I don't think static analysis is *excluded* by the current proposal. I
can see it as a possible future enhancement. But the fact that we don't
have it today doesn't mean we shouldn't step in that direction.


T

-- 
I've been around long enough to have seen an endless parade of magic new
techniques du jour, most of which purport to remove the necessity of
thought about your programming problem.  In the end they wind up
contributing one or two pieces to the collective wisdom, and fade away
in the rearview mirror. -- Walter Bright


More information about the Digitalmars-d mailing list