More on invariants

bearophile bearophileHUGS at lycos.com
Wed Mar 17 18:01:39 PDT 2010


D is the first language that I've used that has contract programming, so I've studied some about this topic.

In a recent post I have discussed about some possible (silly) new parts for the contract programming; this post is just about class/struct invariants and loop invariants.

Some of the following things come from the nice slides:
http://www.cse.yorku.ca/course/3311/slides/08-Assertions.pdf

--------------------

Loop Invariant:

A loop invariant expresses a relation that is true when the loop is entered, or entered once more. 

I've found a small example in Eiffel:

Inserting an element into a sorted singly linked list

p = pointer to the current item
pp = pointer to the list item precedent to the item pointed by p


row := matElem.row
column := matElem.column
from
  p := rowList @ row
invariant
  predecessor_relation : (pp = void and p = rowList @ row) or
                         (pp /= void and pp.next = p)
  predecessor_before_column: pp = void or pp.column < column

  -- forall k : rowList @ row .. pp :: k.column < column
  data_less_than : column_limit(rowList @ row, pp, agent less_than(?, column))
until
  p = void or p.column >= column
loop
  pp := p
  p := p.next
end




I have translated it to D2 (I have removed the semantics of matrix, and added driving code to show it works, I have not written unittests here):


import std.stdio: write, writeln;

struct Node {
    immutable int data; // the cargo can't be changed
    Node* next;
    this(const int d, Node* next_ptr=null) {
        data = d;
        next = next_ptr;
    }
}


void printList(const(Node)* p) {
    write("List: ");
    for (; p != null; p = p.next)
        write(p.data, " ");
    writeln();
}


void prependNode(ref Node* list_header, const int d)
    // list_header is a reference of a mutable pointer that's
    // the header pointer of a const list
    out {
        assert(list_header != null);
    } body {
        Node* ptr = new Node(d, list_header);
        list_header = ptr;
    }


void insertSorted(ref Node* list_header, Node* nptr)
    // nptr is a const pointer to a mutable node,
    // I don't know how to express this
    in {
        assert(nptr != null);
        assert(list_header != nptr);
        //assert(nptr.next == null); // probably too much restrictive

        // the list is sorted
        debug { // this is slow, so it's inside a debug
            if (list_header != nptr) {
                Node* p = list_header; // pointer to the current Node
                Node* pp; // pointer to the precedent Node

                while (p != null) {
                    if (pp != null)
                        assert(pp.data <= p.data);
                    pp = p;
                    p = p.next;
                }
            }
        }
    } out {
        assert(list_header != null);

        // the list is sorted and contains the given node
        debug {
            Node* p = list_header; // pointer to the current Node
            Node* pp; // pointer to the precedent Node

            while (p != null) {
                if (pp != null)
                    assert(pp.data <= p.data);
                pp = p;
                p = p.next;
            }
        }

    } body {
        Node* p = list_header; // pointer to the current Node
        Node* pp; // pointer to the precedent Node

        while (p != null && p.data < nptr.data) {
            /*invariant*/ { // loop invariant --------------------
                // predecessor relation
                assert((pp == null && p == list_header) || (pp != null && pp.next == p));

                // predecessor before data
                assert(pp == null || pp.data < nptr.data);

                // current before data
                assert(p == null || p.data < nptr.data); // extra

                // data less than
                // forall k : rowList @ row .. pp :: k.column < column
                debug { // slow
                    if (pp != null) {
                        Node* pk = list_header;
                        while (pk != pp) {
                            assert(pk.data < nptr.data);
                            pk = pk.next;
                        }
                    }
                }
            } // end loop invariant --------------------

            pp = p;
            p = p.next;
        }

        assert((pp == null && p == list_header) || (pp != null && pp.next == p));
        assert(pp == null || pp.data < nptr.data);

        if (pp != null)
            pp.next = nptr;
        else
            list_header = nptr;
        nptr.next = p;
    }


void main() {
    Node* lh;
    foreach (x; [20, 15, 12, 11, 11, 9, 3, 0])
        prependNode(lh, x);
    printList(lh); // 0 3 9 11 11 12 15 20

    insertSorted(lh, new Node(13));
    printList(lh); // 0 3 9 11 11 12 13 15 20

    insertSorted(lh, new Node(-1));
    printList(lh); // -1 0 3 9 11 11 12 13 15 20

    insertSorted(lh, new Node(30));
    printList(lh); // -1 0 3 9 11 11 12 13 15 20 30
}


I have added those debug{} because they are heavy tests that change the complexity of the code.

That code inside the debug{} is too much long and it looks like it can contain more bugs than the code it guards.


A loop invariant syntax can be useful in D because:
- It self-documents its purpose;
- You can't forget to create a new scope, this is a little safer;
- In future it can be enforced to be @readonly (that is, the programmer can be sure the code inside the invariant can't modify the variabiles in outside the scope of the invariant, but can read them);
- There can be only one loop invariant, and it must be before the body of the loop. This helps better separate the asserts from the body of the loop, and keeps things tidy.


A possible syntax for D2 for loop invariant, no new keywords necessary, and the syntax is essentially a cross between precodition and class invariant:

while (...)
invariant { } // only allowed at the top of the loop, only 1 optional invariant
body {} // body of the loop

for (...)
invariant { }
body {}



Using that syntax the loop becomes:


while (p != null && p.data < nptr.data)
    invariant {
        // predecessor relation
        assert((pp == null && p == list_header) || (pp != null && pp.next == p));

        // predecessor before data
        assert(pp == null || pp.data < nptr.data);

        // current before data
        assert(p == null || p.data < nptr.data); // extra
        
        //...
    } body {	
        pp = p;
        p = p.next;
    }



Note: this loop invariant syntax is an additive change, it can't break older D2 code, so it can be added later.

Note2: While reading Eiffel docs I have read: "Use comments if you cannot write an executable assertion". So it says that in some situations inside the conditions/invariants it's better to have a comment readable by just the programmer, than nothing.

------------------

Class Invariant:

This is a short example, a class written in what I think is Eiffel-like pseudolanguage, it shows some things/features:


class CITIZEN feature
    name, sex, age : VALUE
    spouse : CITIZEN
    children, parents : SET[CITIZEN]
    single : BOOLEAN is ensure Result iff (spouse = Void) end
    divorce is
        require  not single
        ensure  single and (old spouse).single
    end
invariant
        single or spouse.spouse = Current
        parents.count = 2
        for_all  c member_of children  it_holds
             (exists p member_of c.parents it_holds p = Current)
end


With some care all this can be written in with class invariant in D too.

--------------------

Things missing in D Contract programming:

Very useful and probably easy to do:
- class invariant has to be const (readonly instance).
- pre/post conditions can't be able to modify input arguments.

Useful:
- pre/post conditions can't be able to modify variables of outer/global scope (@readonly).
- no loop invariant.
- compact syntax (*)

Less useful:
- no loop variant.


(*) This comes from my experiments and from reading Eiffel docs. The code inside the conditions/invariants has to be short and readable. So it's better it to be quite high-level. Python is very good in this, it has list comps and other things that shorten the code a lot.
In Eiffel you can't put generic code in conditions and invariants, probably to keep them short and almost bug-free. If you want to perform more complex tests Eiffel allows you to call another function from there, that can be debugged normally. So in that D example code it's better to move the debug code to external functions.

Bye,
bearophile



More information about the Digitalmars-d mailing list