CSC/ECE 517 Fall 2011/ch6 6a pc

From Expertiza_Wiki
Jump to navigation Jump to search

Introduction

Programming by contract is a paradigm used in software design that was first introduced by Bertrand Meyer in the 1980s.<ref>Meyer, Bertrand: Design by Contract, Technical Report TR-EI-12/CO, Interactive Software Engineering Inc., 1986</ref> Programming by contract defines a contractual relationship between the caller of a method and the method itself. It is analogous to a business relationship where both parties must meet certain obligations but also receive benefits. In terms of programming by contract, the caller is know as the "client" and the method being called is known as the "supplier".<ref>http://www.cs.usfca.edu/~parrt/course/601/lectures/programming.by.contract.html</ref> The client must ensure that certain preconditions are met before calling the supplier. The supplier benefits from not having to check the arguments itself. The supplier in return must ensure that certain postconditions are met. The client benefits from knowing that it received what it expected. Another concept, also part of programming by contract, is a class invariant. A class invariant is a condition that must be true throughout the lifetime of each instance of that class.

Programming by contract aims to increase software reliability by creating explicit specifications for each method. Contract violations can be detected early on using assertions or similar means. A contract violation indicates a potential bug in the code. Many languages allow these assertions to be used when debugging and turned off in production to increase performance.<ref>http://mindprod.com/jgloss/assertions.html</ref> Languages such as Eiffel and C# (via the System.Diagnostics.Contract library) among others have native support for contracts. C# Code Contracts and the Extended Static Checker (ESC) for Java have the ability to perform static checking at compile time to determine if a possible run-time contract exception can occur. This feature is limited due to the fact that it is equivalent to the halting problem.

Terminology

Preconditions

A precondition is a condition that must be true prior to calling a method for which it is defined. Preconditions bind the caller of that method, that is, it is the responsibility of the caller to ensure that the preconditions are satisfied. The method is not responsible, nor expected, to check the preconditions. The output from the method is not defined for the case where the preconditions are violated. In that case the method could crash the system, go into an infinite loop, or do nothing at all.

The require block constitutes a precondition in the Eiffel programming language.

set_hour (a_hour: INTEGER)
            -- Set `hour' to `a_hour'
        require
            valid_argument: 0 <= a_hour and a_hour <= are23
        do
            hour := a_hour
        ensure
            hour_set: hour = a_hour
        end

Postconditions

A postcondition is a condition that must be true after execution of the method for which it is defined. Postconditions bind the method itself. The method is responsible for ensuring that these conditions are met. The caller of the method is guaranteed that the postconditions will hold if the original preconditions were satisfied when calling the method.

The ensure block constitutes a postcondition in Eiffel.

set_hour (a_hour: INTEGER)
            -- Set `hour' to `a_hour'
        require
            valid_argument: 0 <= a_hour and a_hour <= 23
        do
            hour := a_hour
        ensure
            hour_set: hour = a_hour
        end

Invariants

A class invariant is a condition that must hold true throughout the lifetime of an object. It is a condition imposed on the class itself rather than a method of that class. Class invariants are often checked before and after the execution of each method in that class. An object found in a state where the invariant does not hold, would indicate a bug in the system.

An example of a class invariant: a linked list has an instance variable length that stores the number of nodes in the list. length should always equal the actual number of nodes in the list.

class LinkedList {
    private Node head;
    private int length;  // length stores a cached version of the number of nodes. It should always equal the actual number of nodes.
    
    public LinkedList() {
        head = null;
        length = 0;
    }

    public getLength() {
        return length;
    }

    public void Add(int value) {
        Node n = new Node(value);

        if(head == null) {
            head = n;
        }
        else {
            Node current = head;

            while(current.getNext() != null) {
                current = current.getNext();
            }
            current.setNext(n);
            n.setPrevious(current);
        }
        length++;
    }

    /* other methods omitted */
}

Along with Preconditions and Postconditions, class invariants provide another means to write good code. Class invariants come to the fore when we look at guidelines for writing methods. We typically have two types of methods : accessors, that simply retrieve a value and modifiers, that modify the object on which they are called. There is an important guideline for modifier methods to observe if they are to do their jobs well: "Public methods should always keep objects in a well-formed state.".

Here, well-formed state seems like a very subjective term. A good way to specify what makes objects of a class well-formed is to list the class invariants. A class invariant is a statement about the state of objects of the class between public method calls.

The Following example illustrates the usage and importance of class invariants. For the LinkedList class, the following could be the class invariants:

  1. The value of instance variable length is always equal to the number of elements on the list.
  2. The next instance variable of the last node on the list must have null or some other special placeholder object as its value.

With the first invariant, it is clear that we have to initialize length to 0 in the constructor and update it appropriately in add and remove methods. This will prevent any bugs from creeping into the code and in case they did, debugging becomes much simpler. Similarly the second invariant makes sure that the linked list is well terminated. Otherwise a user traversing the list could end up writing at illegal memory location. Hence class invariants are valuable tools in writing good code.

How do we identify class invariants?

Since class invariants describe the state of an object between method calls, one easy way to identify them is to look at the instance variables of the class, since instance variables hold the state of an object. This has an implication that, many instance variables can lead to many Invariants and hence make it hard to debug. One should look at the trade-off between performance and the number of class invariants when deciding on adding instance variables. For instance, in the LinkedList example, not having _length instance variable would help us reduce the number of invariants, but this would mean that user will have to traverse the list each time he needs to count the number of items on the list.

Assertions

It is useful to check preconditions, postconditions, and invariants when debugging code. A violation of which constitutes a bug in the system. This can be accomplished using an assert <ref> http://java.sun.com/developer/technicalArticles/JavaLP/assertions/ </ref> statement in Java or Contract.Requires statement in C# 4.0. These programming features can be used to alert the programmer of bugs in development and can be turned off in production. For example, suppose you have the following code in your method:

    void foo() {
        for (...) {
            if (...)
                return;
        }
        // Execution should never reach this point!!!
    } 

Rather than just leaving a comment, you could add an assert statement as follows:

    void foo() {
        for (...) {
            if (...)
                return;
        }
        assert false; // Execution should never reach this point!
    } 

This will throw an AssertionError when the program reaches that unexpected point of execution. This enables the programmer to catch any irregularities in the code at any stage of development, making the process of debugging very easy. Similarly assertions can be used to verify the preconditions, postconditions and class invariants in a method. You can assert that preconditions hold good at the start of the method, and that postcondition and class invariant hold good at the end of method execution. This ensures that contract is always being satisfied by both the supplier and the client. Of course having assert statements incurs overhead, for instance, if we have to perform some clean-up in case the post condition is not met. So most of the programming languages provide a means to disable the assertions at runtime. E.g in Java, you could use: -disableassertions or -da switch.

Note that assertions do not allow for recovery from errors; an assertion failure will normally halt the program's execution abruptly. Hence it would be unwise to use assertions where normal error handling is necessary. Assertions should only be used to document logically impossible situations and discover programming errors.

Software Reliability

A major concern in software design is Reliability, which depends on two factors: 1) Correctness - system's ability to perform its job according to the specification and 2) Robustness - handling abnormal conditions. Reliability is especially important in Object oriented software because of the additional complexity imposed by Reusability. Some of the common problems encountered when writing software include:

  • System failure - CPU crash, Disk/Memory access errors, Network errors
  • Invalid input data - Out of range data, Bad file name,etc.
  • Programming errors - Memory leaks, Buffer overruns, etc.

There are many programming language features that help tackle these issues. Static typing, for example, is a major help for catching inconsistencies before they have had time to become bugs. Similarly Garbage collection helps to remove the specter of devious memory management errors. Various software design methodologies also come handy when dealing with complex software. For instance, Reusability itself can help eliminate lots of bugs, if we are reusing existing code that has already been thoroughly tested and deployed. Polymorphism also helps in handling issues related to maintainability by reducing the size of the code and making it more elegant. But in a large and complex software involving multiple developers, we need a more systematic approach of specifying and implementing object-oriented software elements and their interactions. This is exactly what "Programming by contract" promises to do.

Programming by contract

Basically programming by contract creates a contract between the software developer and software user, which are referred to as the supplier and the consumer. Every feature, or method, starts with a precondition that must be satisfied by the consumer of the routine. And each feature ends with postconditions which the supplier guarantees to be true after it is executed(if and only if the preconditions were met). Consider the following example of a static Intersect() method on a Set object (Quoted from Skrien):

public static Set intersect(Set s1, Set s2)
  Precondition: s1 and s2 are not null.
  Postcondition: 
    Returns a Set with the common elements of s1 and s2, if s1 and s2 are not null.
    If s1 or s2 is null, an IllegalArgumentException is thrown.

Note here that this documentation is not overly specific about the implementation nor is it very general and vague. This can be stated concisely as a guideline: "The external documentation for a method should be sufficiently specific to exclude implementations that are unacceptable but sufficiently general to allow all implementations that are acceptable". are

Also, the above contract does not mention anything about the behavior of the method when preconditions are not satisfied. This is Meyer’s non-redundancy principle: "Under no circumstances shall the body of a routine ever test for the routine’s precondition". This implies that if the preconditions are not met, the behavior of the method is undefined (it could crash, throw an exception, or run in an infinite loop, etc.).

In essence, this theory suggests associating a specification with every software element. These specifications (or contracts) govern the interaction of the element with the rest of the world. This is different from having a formal specification in that the specification language is embedded in the design and programming language (e.g. Eiffel). This makes it possible to use a single notation and a single set of concepts throughout the software life cycle, and since these are modeled based on the programming language itself, it provides a great deal of clarity about the requirements at each stage of software development.

Contract and Inheritance

An important implication of Programming by Contract is when Inheritance and polymorphism are involved. Subtype polymorphism means: "If S is a subtype of T,then any term of type S can be safely used in a context where a term of type T is expected". In case of object oriented programming, type is defined by a class and inheritance enables us to define subtypes of a class (If Base class defines a type, the Derived class becomes a subtype). In this situation, the Derived class should fully honor the Base class contract. Consider the following example:

Suppose you are given a class List, that has a public interface: insert(Item const& x) . And suppose you have to implement a new class SortedList, and provide insert method with the same signature. Isn't is obvious that SortedList be made a subclass of List and override the insert() method?. The answer is no!.

To decide whether we can override the List::insert() method, we will have to look at the Contract promised by this method. Suppose the base class insert() method has advertised its contract as "Promises a copy of Item will be inserted somewhere within this List". In this case the derived class can override this, since this is a weak post-condition which can be strengthened in the derived class. So the derived class insert() can say that "Promises a copy of Item will be inserted in the sorted order within this List" and no contract is broken.

Suppose, instead the Base class method has a strong post condition: "Promises a copy of Item will be inserted at the end of this List". Then the derived class cannot override this, since derived class cannot guarantee that the Item will be inserted at the end, hence a breach of Contract.

So why bother with all this Contract business?

Consider in the above example (strong post-condition case), suppose the derived class breaches the contract and overrides insert() anyway. Now a user who was using the List::insert() method assumes that the items are added at the end, i.e sorted based on time of insertion. But suppose he accidentally calls SortedList::insert() through the base class List pointer, then the Item would be inserted somewhere in-between based on the value it contains, rather than the time of insertion. This will surely introduce hard-to-find bugs in the code, which is the problem this principle was designed to solve. Liskov's substitution principle that deals with the Base-Derived relationship can be rephrased based on this understanding:

It is acceptable to subclass B from a class A, if for every public method with identical signatures in A and B,

  1. Pre-conditions for B's method are no stronger than preconditions for A's method
  2. Post-conditions for B's method are no weaker than postconditions for A's method

Comparison with Defensive Programming

Restating the crux of "Programming by contract": "The supplier class guarantees postconditions of its methods as long as client guarantees that preconditions of those methods are met". This implies that if the client fails to satisfy the preconditions, the behavior of supplier is undefined, meaning it could result in an Exception being thrown, or a system crash.

Defensive programming on the other hand, augments this guideline as - "Try to make your methods do something appropriate in all cases so that there are no preconditions". You know errors are almost certainly going to occur and illegal input is going to be given to methods, so make sure that your methods can defend against such input by doing something explicit in a way that is helpful to the user of those methods.

However, this approach could introduce major performance hit. For instance, consider a method that performs binary search on an integer array. A precondition for such a method is that the array is sorted. To eliminate this precondition, the binary search method could first check the array to see if it is sorted and, if not, throw an exception. However, in that case the method’s efficiency is reduced from O(log n) to O(n), where n is the array size.

Languages with Native Support

Eiffel

Design by contract is central to Eiffel; it is built into the language.<ref>http://www.eiffel.com/developers/design_by_contract.html</ref> When using inheritance to override a method, one may only weaken the preconditions not strengthen them. Thus a call to any method of a subclass is guaranteed to meet the preconditions of the parent class. Conversely, postconditions can only be strengthened in an overridden method so that the results guaranteed by the parent class will still be met by the base class.<ref>http://www.eiffel.com/developers/design_by_contract_in_detail.html</ref>

Eiffel uses the require keyword to introduce preconditions and the ensure keyword to introduce postconditions. If a precondition or postcondition is violated, an exception is thrown. Exception handling is not used for control flow or for invalid data input. In Eiffel, one can handle an exception using the rescue keyword. Inside the exception handling section, the retry keyword may be used to execute the method again.

connect_to_server (server: SOCKET)
      -- Connect to a server or give up after 10 attempts.
    require
        server /= Void and then server.address /= Void
    local
        attempts: INTEGER
    do
        server.connect
    ensure
      connected: server.is_connected
    rescue
        if attempts < 10 then
            attempts := attempts + 1
            retry
        end
    end

C#

C# 4.0 has native support for code contracts via the System.Diagnostics.Contract library. Preconditions are introduced with a call to Contract.Requires and postconditions with a call to Contract.Ensures. Code contracts also allow the programmer to access the return value when declaring a postcondition.<ref>http://msdn.microsoft.com/en-us/library/dd264808.aspx</ref>

public void Add(int num1, int num2)
{
    Contract.Requires(num1 > 0);
    Contract.Requires(num2 > 0);

    return num1 + num2;

    Contract.Ensures(Contract.Result<int>() > 0);
}

Code Contracts also support class invariants. An invariant method must be decorated with the ContractInvariantMethod attribute and contain only calls to Contract.Invariant. The class invariant method is called after the execution of every public method in the class.

[ContractInvariantMethod]
protected void ObjectInvariant () 
{
    Contract.Invariant ( this.y >= 0 );
    Contract.Invariant ( this.x > this.y );
    ...
}

Static Checking

In addition to runtime contract checking, some libraries offer static checking at compile time as well. These tools can catch contract violations without even running the code. C# code contracts<ref>http://msdn.microsoft.com/en-us/magazine/hh335064.aspx</ref> and the Extended Static Checker (ESC) for Java<ref>http://www.eecs.umich.edu/~bchandra/courses/papers/Flanagan_ESC.pdf</ref> offer just that. Static analysis has theoretic limitations due to it being equivalent to the halting problem.<ref>http://www.efgh.com/math/impossible.htm</ref><ref>http://c2.com/cgi/wiki?HaltingProblemDiscussions</ref>

Topical References

<references/>