an old topic (pun intended)

Wed Oct 26 12:51:22 PDT 2011

     (continuing a thread from D.learn)

Eiffel supports an 'old' construct in post-conditions. For example, an 
Add method of a Bag class (generic collection container) might verify 
that exactly one element was added to the collection, not zero and not 
more than one:

     Add(newItem : T) is
     do
         -- insert newItem into the Bag
     ensure
         exactly_one_added: count = old count + 1
     end

D2 post-conditions do not provide 'old', and are less useful than they 
could be for its lack.

(Note: D post-conditions appear to be applicable to bare statements, not 
just functions. I don't see the point of statement post-conditions -- 
you can do the same thing with 'assert' -- so I focus here on function 
post-conditions.)

It has been suggested that 'unittest' largely obviates the need for 
'old'. I disagree with this view. A unit test verifies only the 
conditions that it explicitly proposes.  In contrast, a post-condition 
will exercise "real-world data" in production-level runs during testing, 
and is thus likely to trigger on conditions that the unit test never sees.

In the D.learn thread, Jonathan M Davis and Ali Çehreli comment

     JMD> IIRC, old was rejected because it added extra complication for 
little value.
      AÇ> Yes, there were technical difficulties in making the stack 
frame at the call point available to the stack frame at the exit point.

Another difficulty is examining the post-condition code for 'old expr' 
or 'old(expr)' and hoisting those elements out of the middle of an 
expression for prestate capture.

I want to propose a syntax extension that, it seems to me, makes it 
easier for the compiler to do all of this.

In D2, a post-condition is introduced with either 'out(result)' or 
'out(...)' [the latter now elided(??) to merely 'out']. That is, 'out' 
has what appears to be something like an argument list. Let's use it:

     void foo(T param_T, Q param_Q)
     out(result, X preX = exprX, Y preY = exprY) {   // <== prestate 
captures added to out "argument" list HERE
         // various assert()s using the captured prestates
     }
     body {
         // ..actual body of foo()
     }

A lowering of this would look something like

     void foo(T param_T, Q param_Q) {
         X __fresh_symbol_for_preX = exprX;
         Y __fresh_symbol_for_preY = exprY;
         // ..actual body of foo()
         assert(__postcondition_code);
     }

That is, the compiler would hoist the prestate variables (and the code 
that sets their values) out of the "argument" list, and inject them into 
the function body at its very start, much like a template instantiation.

Notes:

     +  "erasing" the actual symbols in the argument list and replacing 
them with compiler-generated fresh symbols hides those symbols from the 
function body code (which should not know of them)
     +  but those captured prestate-values exist at the outermost scope 
of the function body, just as the returned value of the function does. 
If we can pass the magic symbol 'result' to the post-condition, we 
should be able to pass
the captured prestate
     +  the scope and condition-at-time-of-evaluation for exprX and 
exprY are exactly what we want: they have access to anything that the 
function body can access (such as variables in enclosing functions, 
class members etc.) but NOT to
local declarations (which they should not be able to see) because those 
haven't been declared yet; they are part of the function body code that 
follows the injected prestate-capture code
     +  the compiler is relieved of the burden of sniffing out "old" 
sub-elements of expressions in the post-condition body and figuring out 
exactly what bits of frame and scope need to be preserved to compute 
them. The onus is on
the programmer to declare what prestate he wants to verify, and how to 
capture it, to whatever level of complexity he wants or needs to employ
     +  exprX etc. can use any syntax element that is legal within the 
function body (since it is hoisted and injected *into* the function body 
by the compiler), including things like lambda functions, etc. (In 
practice, "old"
expressions in post-conditions tend to be pretty tame.)
     +  the programmer can name the captured prestate values with any 
symbols he wishes. This preserves some of the flavor of Eiffel's 
per-post-condition labels.
     +  capture of prestates is separated from evaluation of 
post-function conditions *using* those prestates. This may make it more 
obvious what's being tested and how.
     +  no new 'old' keyword or magic 'old()' function needs to be added 
to the language, particularly one that is useful only in this isolated 
corner of the syntax

Two weird characteristics of this syntax:

     -  'result' is evaluated AFTER the function body executes, but 
prestate-captures are evaluated BEFORE the function body executes
     -  normal argument lists contain only expressions, not declarations 
of symbols to be assigned from those expressions

I think the syntax is clean and flexible enough to forgive those 
transgressions.

An example ('count' and 'color' here are members of an enclosing class, 
not shown):

     void change_count_and_color(int countIncrement, bool flipColor)
     out (result, auto preCount = count, auto previousColor = color) {
         assert(count == preCount + countIncrement);
         if (flipColor) {
             assert(color == ComplementaryColorOf(previousColor));
         }
         assert(isSensible(result));
     }
     body {
         // ..actual body of the function
     }

   =========

Comments, criticisms, jeers and guffaws cheerfully invited. (Also "if 
you're so smart, there's the source code, YOU implement it!" But then 
*I'll* guffaw.)

-- Davidson