Proofs of Programs

(For anyone who is interested, "The Science of Programming", by D. Gries, Springer Verlag, is all about this stuff)

The basic idea is that we want to produce logical formulæ that exactly describe the behaviours of programs or pieces of programs. If we have a guaranteed correct way of producing these formulæ, we will have a way of proving programs correct (i.e. proving that they do what they are supposed to do) beyond any possible doubt.

If S is a statement, group of statements, or even a whole program, and P and Q are logical expressions, then: {P}S{Q} means "if P is true immediately before the statement S is executed, Q will be true immediately afterwards".
The similar statement: {W}=wp(S,{Q}) means "if you want Q to be true immediately after the statement S is executed, you absolutely must ensure that W is true immediately before".
        The two expressions are very similar, but not the same. W is the Weakest Precondition that will guarantee the desired results; P is just some particular precondition that guarantees the desired results. "Weakest" means "least restrictive"; P may include conditions that we know are true in a particular case, but are not strictly necessary to guarantee the results; W may not include any unnecessary conditions. For example: wp(x=y+1, {x>10}) = {y>9}
{y=12 && x=7} x=y+1 {x>10} are both true. The first says that if you want (x>10) to be true after executing x=y+1, you must ensure that (y>9) before; nothing more is needed, nothing less is good enough. The second says that if (y=12) and (x=7) when you start executing x=y+1, then (x>10) will be true afterwards. Both statements are correct.
        The general rule is that if we have both {P}S{Q} and {W}=wp(S,{Q}) then P=>W. ("=>" is the logical symbol for "implies". "A implies B" means that whenever A is true, B is also true, but not necessarily vice-versa). This means that we only have to work out rules for calculating wp, the pickiest one. Once we have the formula for W, any P that is only true when W is true will do.
        Curly { brackets } appear around logical formulæ just to make them stand out; they do not add any particular meaning.

The rules

SKIP

The simplest statement of all is SKIP (in invented statement to give us somewhere to start) SKIP does nothing at all, so anything that you want to be true after executing it must already have been true before, so: wp(SKIP,{Q}) = {Q}

Assignment

The assignment statement is next. The assignment x=V replaces the value of x with V, so anything that we want to be true about x must already have been true about V. This gives the rule: wp(x=V, {Q}) = {Q[with every occurrence of x replaced by V]} This doesn't always seem quite right at first sight, but some examples always clear up any doubts:
 wp(x=5, {y>x}) = {y>5}. If you want y to be more than x after making x be 5, you must ensure that y is more than 5 before.
 wp(x=x+1, {x=10}) = {(x+1)=10} = {x=9}. If you are going to add one to x, and you want x to be 10, you need x to be 9 now.
 wp(x=sqrt(y), {x>12 && y>7}) = {sqrt(y)>12 && y>7} = {y>144 && y>7} = {y>144}
If you are going x become the square root of y, and you want x to be more than 12, you must make sure that y's square root is more than 12, which means that y must be more than 144. If you also want to ensure that y is more than 7, it turns out that that will be taken care of anyway.
A few more examples should work without further explanation:
 wp(x=y+z, {y>x}) = {y>y+z} = {z<0}.
 wp(x=y, {y=x}) = {true} i.e. inevitable.
 wp(x=3, {y>z}) = {y>z}.
 wp(x=x/y, {x>6}) = {x/y>6} = {x>6y}.
The last example illustrates that the basic rule isn't quite complete. Starting out with the knowledge that x is more then 6 times y, is not enough to guarantee that x=x/y will make x be more than 6. What if y is zero? Then the statement won't work, and you won't be able to say anything about the results at all.

There is a special formula Dom(f); Dom stands for domain, and f can be any formula. Dom(f) gives the conditions under which evaluating f will not go wrong. It has a simple resursive definition:
 Dom(variable) = {true}
 Dom(constant) = {true}
 Dom(A+B) = Dom(A) && Dom(B)
 Dom(A-B) = Dom(A) && Dom(B)
 Dom(A*B) = Dom(A) && Dom(B)
 Dom(A/B) = Dom(A) && Dom(B) && B!=0
 Dom(sqrt(A)) = Dom(A) && A>=0
 etc. etc. etc.
So, for example, Dom(x+3*y) = {true}, it just can't go wrong.
 Dom(x+sqrt(x)) = {x>=0),
 Dom(1/x+2/(x+1)) = {x!=0 && x!=1),
 There are special rules for logical expressions, because sometimes only one operand is evaluated. In (A&&B), A is always evaluated; if it turns out to be false, there is no point in evaluating B, it can't make any difference to the result. So the requirement is that A must be safe, but then either the value of A must be false, or B must be safe, so we get:
 Dom(A && B) = Dom(A) && (!A || Dom(B))
 Dom(A || B) = Dom(A) && (A || Dom(B))
Whenever an expression is to be evaluated, Dom must be satisfied for any results to be guaranteed, so the complete rule for assignment is: wp(x=V, {Q}) = Dom(V) && {Q[with every occurrence of x replaced by V]} And the last example is correctly given as:
 wp(x=x/y, {x>6}) = {y!=0 && x>6y}.

Sequential Composition

If two statements are to be executed, one after the other, then the first statement is responsible for getting us into a position from which the second statement is guaranteed to deliver the desired results. That means that is we want Q to be true after S1;S2; is executed, we first work out W=wp(S2,{Q}), so W is the absolute requirements for S2 to deliver Q. S1 must be able to make Q true, so the initial requirement is wp(S1,{W}). Or to put it completely: wp(S1;S2;, {Q}) = wp(S1, wp(S2, {Q})) Examples:
        wp(x=1;y=2, {x>y}) = wp(x=1, {x>2}) = {1>2} = {false} i.e. impossible
        wp(x=a;y=b, {y>3}) = wp(x=a, {b>3}) = {b>3}
        wp(x=x+1;x=x*2, {x=k}) = wp(x=x+1, {x*2=k}) = {(x+1)*2=k} = {x=k/2-1}

Conditionals

The general form of a conditional statement is "if COND then S1 else S2". For an "elseless" if statement just make S2 be SKIP. If executing that statement is to make Q be true, then of course it must be possible to evaluate COND without disaster, so Dom(COND) must be true. Then either S1 or S2 will be responsible for making Q be true, so we will require either wp(S1,Q) or wp(S2,Q); the choice being detrmined by whether or not the condition is true. The rule we obtain is: wp(if COND then S1 else S2, {Q}) = Dom(COND) && (COND && wp(S1,Q) || !COND && wp(S2,Q)) or wp(if COND then S1, {Q}) = Dom(COND) && (COND && wp(S1,Q) || !COND && Q)
Example: finding the maximum in a single statement. (The logical formula that expresses the fact that M=max(A,B) is (M=A || M=B) && M>=A && M>=B; it is not good enough for M to be greater than A and B, it must be equal to one of them.)

 wp(IF a>b THEN m=a ELSE m=b, {(m=a || m=b) && m>=a && m>=b})

 = (a>b && wp(m=a, {(m=a || m=b) && m>=a && m>=b})
 || (a<=b && wp(m=b, {(m=a || m=b) && m>=a && m>=b})

 = (a>b && (a=a || a=b) && a>=a && a>=b)
 || (a<=b && (b=a || b=b) && b>=a && b>=b)

 = (a>b && (a=a || a=b) && a>=a && a>=b)
 || (a<=b && (b=a || b=b) && b>=a && b>=b)

 = (a>b || a<=b)

 = {true}

Doing the same example in a slightly different way, with a single-branch if, we see a different procedure, but of course the same results:

 wp(m=a; IF m=a && m>=b})

 = wp(m=a, (a=a && m>=b}))
 || (a>=b && (m=a || m=b) && m>=a && m>=b))

 = wp(m=a, (a=a && b>=b)
 || (a>=b && (m=a || m=b) && m>=a && m>=b))

 = a=a && b>=b
 || a>=b && (a=a || a=b) && a>=a && a>=b

 = (a=b)

 = {true}

Loops

The direct way to reason about a loop, and calculate wp(WHILE cond DO body, {Q}), is based on the number of times it is executed before it finishes. The loop could be executed:
        No times at all, in which case the condition must be false, and the requirements must already be satisfied: L₀=Dom(cond) && !cond && Q.
        or Exactly once, in which case the condition cond must be true, and executing the body once must get us into a state in which the loop will execute no more but still deliver the result: L₁=Dom(cond) && cond && wp(body, L₀).
        or Exactly twice, in which case the condition must be true, and executing the body once must get us into a state in which the loop will execute exactly once more and deliver the result: L₂=Dom(cond) && cond && wp(body, L₁).
        etc. The general case is: L_n+1=Dom(cond) && cond && wp(body, L_n), giving wp(WHILE cond DO body, {Q}) = L₀ || L₁ || L₂ || L₃ || L₄ || ... Unfortunately, this infinite formula is just too big to be usable. Instead, a less direct approach is taken, based upon some knowledge of how the loop is supposed to behave.

We must design first an INVARIANT: a formula that is always true every time around the loop, and describes in some way what the loop is supposed to do. Examples:
A loop that searches an array looking for a particular value:

i=0; found=0; while (i<n && !found) { if (a[i]==value) found=1; i=i+1; } The loop is designed so that if found is false, all elements of the array before position i have been searched, and are definitely not equal to value. In other words, if found is true a[i] must= value, if found is false then for any x<i, a[x]!=value, giving:
INV = (found && a[i]=V) || (!found && x a[x]!=V)

A loop that adds up the first 10 elements of an array:

i=0; s=0; while (i<10) { s=s+a[i]; i=i+1; } The loop is designed so that i represents how far through the calculation we are, and s is always correct so far (meaning s is the result of adding up the first i elements), giving:
INV = s=sum(a,0,i-1)
(the first i elements are a[0], a[1], ..., a[i-1], not including a[i])

(with the obvious definition, sum(a,X,Y)=a[X]+a[X+1]+...+a[Y]. If X=Y sum(a,X,Y)=a[X]. If X>Y sum(a,X,Y)=0.

We shall stick with the second example because the first one gets a bit hairy. We can prove that the body of the loop leaves the invariant true if it was true to start with. Another way of saying this is that the invariant itself is a sufficient condition to ensure that executing the body will make the invariant true. This means we should attempt to prove (cond && INV)=>wp(body,INV). Note the use of implies rather than equals (We only need to show that we have satisfied the weakest conditions that guarantess the desired result), and the appearance of "cond" (The body will only ever be executed if cond is true, so we can make use of and useful information that cond contains)
For the second example:
 wp(s=s+a[i]; i=i+1, INV)
 wp(s=s+a[i]; i=i+1, {s=sum(a,0,i-1)})
 = wp(s=s+a[i], {s=sum(a,0,i)})
 = {s+a[i]=sum(a,0,i)}
 = {s+a[i]=a[0]+a[1]+...+a[i-1]+a[i]}
 = {s=a[0]+a[1]+...+a[i-1]}
 = {s=sum(a,0,i-1)}
 = INV
So in this case, INV=>wp(s,INV) is certainly true, so (cond && INV)=>wp(s,INV) must also be true.

For any loop "WHILE cond DO body" if we pick an invariant and prove that (cond && INV)=>wp(body,INV), we have proved that so long as the invariant is true when we reach the loop, if the loop ever finishes the invariant will still be true. If we select an invariant that somehow embodies the correctness of the loop, all we need to do is prove that the loop will stop, and we are finished.
 We can see that the invariant successfully embodies the correctness of the loop. s=sum(a,0,i-1) says that while we are still running the loop, the result (in s) is correct as far as we have gone. Once the loop finishes, if we can somehow establish that i must be 10, we will have a final result of s=sum(a,0,9) which describes the desired result of the whole program. It is important to choose an invariant that works in this way.

We have proved that the invariant remains true if it starts that way. Now we must prove that it does start that way. In other words, that the statements before the loop are guaranteed to make the invariant true, In other words wp(i=0; s=0, {INV}) = {true}.
This one is easy:
 wp(i=0; s=0, {INV})
 = wp(i=0; s=0, {s=sum(a,0,i-1)})
 = wp(i=0;, {0=sum(a,0,i-1)})
 = {0=sum(a,0,-1)}
 = {0=0}
 = {true}

The next step is to design a VARIANT: some expression that is absolutely guaranteed to change each time round the loop. If we arrange for the variant to always get smaller by some minimal amount, and ensure that when the variant is reduced to some minimum value, the loop must stop (i.e. cond becomes false), we will have proved that our program will eventually stop. Traditionally, the variant is some integer-valued expression, which can not reach zero without making the loop condition false. So, the requirements for a variant that must be proved are:
1: It must get smaller each time the loop body is executed:
 wp(body, {VAR=x}) => {VAR>x}
2: It must force the loop condition to be false if it reaches zero:
 {VAR<0} => {!cond}

For our second example, there is a very clear choice for the variant: VAR = 10-i
The proofs are as follows:
1: It gets smaller each time:
 wp(body, {VAR=x})
 = wp(s=s+a[i]; i=i+1, {10-i=x})
 = wp(s=s+a[i], {10-(i+1)=x})
 = {10-(i+1)=x}
 = {9-i=x}
and 9-i=x certainly implies that 10-i>x, so we have the required wp(body,{VAR=x})=>{VAR>x}.
2: When it reaches zero the loop condition must be false:
 VAR<=0
 = (10-i)<=0;
 = 10<=i;
 => !(i<10).
So now we have proof that the loop must eventually stop.

 However, we have not got enough to establish that the program is correct. When the loop terminates, we know that the loop condition is false, and the invariant is true: i>=10 && s=sum(a,0,i-1). In other words, s will contain the sum of at least the first ten elements of the array, but mabe some more too.
 The problem is simply that we neglected to establish that i will be exactly 10 when the loop stops. This is easy to handle. An important thing about our loop is that it never make i exceed 10. That was such a fundamental and obvious feature of the design that it was overlooked in designing the invariant. It is very easy for this to happen; usually program provers make a few attempts at a proof, when one attempt falls short, we normally expect the failure to give some clues, as it did here.
The invariant should have been INV = s=sum(a,0,i-1) && i<=10.
(note <=, not just <, because the loop does allow i to reach ten, even though it stops immediately on that happening. The loop is supposed to stop with i equal to 10).
 Now we must re-do the proofs with the new invariant, fortunately (and usually) the new invariant has little effect on the proofs, and we can almost just copy the old ones:
First, the invariant is maintained by the loop:
 wp(s=s+a[i]; i=i+1, INV)
 wp(s=s+a[i]; i=i+1, {s=sum(a,0,i-1) && i<=10})
 = wp(s=s+a[i], {s=sum(a,0,i) && i<=9})
 = {s+a[i]=sum(a,0,i) && i<=9}
 = {s+a[i]=a[0]+a[1]+...+a[i-1]+a[i] && i<=9}
 = {s=a[0]+a[1]+...+a[i-1] && i<=9}
 = {s=sum(a,0,i-1) && i<=9}
 INV && cond = {s=sum(a,0,i-1) && i<10}
 which is identical. (if i is an integer, {i<=9} = {i<10})
 So again, INV=wp(s,INV) is certainly true, so (cond && INV)=>wp(s,INV) must also be true.
Second, the invariant starts true:
 wp(i=0; s=0, {INV})
 = wp(i=0; s=0, {s=sum(a,0,i-1) && i<=10})
 = wp(i=0;, {0=sum(a,0,i-1) && i<=10})
 = {0=sum(a,0,-1) && 0<=10}
 = {0=0 && 0<=10}
 = {true}
Finally, the the fact that the variant demonstartes that the loop stops does not need to be rechecked beacuse neither the variant not the loop have been modified.

Now, we know that at the end of the loop, the invariant is true, and the loop condition is false:
 s=sum(a,0,i-1) && i<=10 && !(i<10)
 = s=sum(a,0,i-1) && i<=10 && i>=10
 = s=sum(a,0,i-1) && i=10
 => s=sum(a,0,9)
Which is exactly what we wantde all along, so the program is proved correct.

Summary

1. Select a variant and an invariant,
2. Prove that the invariant is maintained by the loop:
 {cond && INV} => wp(body,INV),
3. Prove that the invariant is set up before the loop:
 wp(pre-loop,INV) = {true},
4. Prove that the variant is reduced by the loop:
 {cond && wp(body,{VAR=x})} => {VAR>x},
5. Prove that the loop must stop:
 {VAR<=0} => {!cond},
6. Prove that when the loop stops, your desired results are ready:
 {INV && !cond} => {desires}.

Or, by jiggling the stages around a little, we get:

wp(preloop; WHILE cond DO body, {Q})	=	wp(preloop, {INV})
	&&	{cond && INV} => wp(body,INV)
	&&	{cond && wp(body,{VAR=x})} => {VAR>x}
	&&	{VAR<=0} => {!cond}
	&&	{INV && !cond} => {Q}