templatizing parameters and induction (was Re: Writing Bug-Free C/D Code)

Thu Mar 22 21:41:08 PDT 2007

Dan wrote:
...
> What should then happen is this:  The compiler should recognize that we
 > only accept values such that x > y.  Knowing that, it should then verify
> that in all cases where this function can be called, that the input to the
 > function must have this condition.  In cases where this *can* fail, it
> should raise an error - such as variables that are affected by user input
 > or outside parameters that aren't already bounds checked or adjusted...
> 
> Theoretically, one could take any given int (between int.min and int.max)
 > and adjust it such that x > y will be true for all cases.  This can be
 > proven and verified by a proof by induction.
> 
> If you don't isolate the in/out tests from the body, a proof by induction
> couldn't know what needs to be proved to prove the code safe.

I think there are very few cases where this kind of induction could 
reasonably be done, unless x and y are literals.  I suspect that in most 
cases where the compiler had enough info to do this kind of induction 
checking, the code for the function could nearly as easily be "constant 
folded" and optimized away entirely.  Prove me wrong, but I suspect that 
this kind of induction is too hard (and in the cases where it works, too 
slow) to be practical.

If x and y are literals, though, you can do something like this now:

Original source:

int foo(int x, int y, int z)
{
     assert( x > y ); // runtime
...
}

int z = foo(10, 20, 30);

Checking at compile time:

int foo(int x, int y)(int z)
{
     static assert( x > y ); // compile time
...
}

int z = foo!(10, 20)(30);

It's a very simple transformation --- take any compile-time known values 
and make them template parameters.  In this case, the calling syntax 
even looks nearly the same!

For most of the practical cases where something is induction testable at 
compile time, this technique should get what you are suggesting.  You 
are marking the path of x and y, and the compiler will do the actual 
induction via constant folding.

If you don't have a constant 10 and 20 for x and y, then propagate the 
"templatization" of x and y up the call stack (making them template 
parameters at every level) until you get to the point where x and y are 
created or computed.

*Note: if you find a place where you can't compute x and y at compile 
time, induction would also have failed there.

Actually ... it would be interesting to fiddle with a compiler that does 
this "templatization" for all compile time known parameters, as an 
optimization --- it would have to be a global optimization of course.

I heard many years ago about an idea that you could take some of the 
input for a program and use it to optimize the entire program.  For 
example, you could take a python program and a python interpreter, and 
insert the python program into the "main" of the interpreter.  The idea 
was that you could then "optimize" the interpreter + script until both 
together were an optimized 'C' program that was equivalent to the 
script.  In theory, this works -- in real life, probably not very well.

Kevin