CSC/ECE 517 Fall 2010/ch2 5c gn

From Expertiza_Wiki
Revision as of 00:06, 13 December 2010 by Thatvamasi (talk | contribs) (→‎'''Dynamic Dispatch''')
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Dynamic Dispatch

This wiki-page serves as a knowledge source for understanding the concept of Dynamic Dispatch in OOLS (Object oriented Languages and Systems). Here we will make an effort to address the efficiency considerations about Dynamic Dispatch, advantages of Dynamic Dispatch using parameter types and dynamic binding in CLOS (Common Lisp Object System).

Introduction

Dynamic Dispatch is the process of identifying which version of the function to call based on the runtime type of the reference being used to execute the function. It is generally used in Object-Oriented Language (OOL) [1]. For example, we can have a super class which defines a particular function and then there is a subclass which defines the same function (called as Method Overriding [2]). When a reference to the super class is used to execute that function, then which version of the function gets executed depends on the type, the reference to which it points to at that time rather than the type of the reference. In literature it also referred to in different names like Runtime Binding or Dynamic Polymorphism. [3]

To further understand what Dynamic Dispatch is, consider the below example:

class Animal {
    public void walk() {
        System.out.println("Animal::walk");
    }
}

class Dog extends Animal {
    public void walk() {
        System.out.println("Dog::walk");
    }
}
  
class Cat extends Animal {
    public void walk() {
        System.out.println("Cat::walk");
    }
}
Animal a = new _______();
a.walk();

The compiler cannot decide which version of walk (Animal, Dog or Cat) during compile time. So the decision of which version of the walk function to call will be deferred till the runtime.

Need for Dynamic Dispatch

Dynamic Dispatch is one of the inherent feature of an Object-Oriented Language [4]. In a reasonably large software based on a procedural language like C, you will come across a bunch of switch or if statement which decides which version of the function to call. Ofcourse Dynamic Dispatch can be simulated in a procedural language like C, but it involves dangerous pointer manipulation, which is usually done well by the Compilers in case of OOL like C++.

For example, consider a toll gate application which charges differently based on the car type like Sedan, SUV, Coupe etc., If we have to do this in a procedural language like C in which each classes of car represented by structure, we might have something like this.

float calculate_charge(void *car, int type)
{
  switch(type)
  {
    case SUV:
        return suv_charge(car);
    case SEDAN:
        return sedan_charge(car);
    case COUPE:
        return coupe_charge(car);
}

You can see that the code above will quickly becomes messy as we define new type of car. If we have done the same in an OOL language like Java the above statement could be replacement by a simple:

car.charge();

The function charge will be overridden in each of the subclasses (Sedan, Suv, and Coupe). From the above example it is very evident that why we need dynamic dispatch and it is obviously useful.

How Dynamic Dispatch works?

The implementation of Dynamic Dispatch varies significantly based on type language, in fact it changes even from Compiler to Compiler for a same language. We will consider how function are dispatch at runtime in both C and C++. We will try to understand if there is any overhead involved with virtual functions.

In a machine compiled language like C, program logic is compiled into underlying processor instruction set by the compiler. Incase of x86 compiler, C code will be compiled into X86 instruction sets. The function will be compiled and placed in Code Segment [5] area of the process image. So a function in C will just map to a memory location in Code Segment of the process image. For example calling a function 'print()', might be converted to something like 'jmp 0xFFDE123' (jmp is a branching instruction). So a function call is in fact as simple as executing a jump statement (of course you push the arguments into stack before you call jump).

Now lets consider a OOL like C++ which too compiles to hardware instruction set on compilation. Consider the below example of two classes which extend from a common class called Animal (not shown here) with one method talk. The below example just demonstrates how dynamic dispatch could be implemented by a compiler.

A word of caution before you proceed; the implementation is hypothetical and far from any practical use. The objective of the example shown below is just to give an idea of how Dynamic Dispatch might be implemented. Practical implementations are lot more complicated, so we have taken this simple example for easy explanation and understanding.

class Cat
{
    public:
        void talk() 
        {
             cout << "Meow!!";
        }
}; 

class Dog
{
    public:
        void talk() 
        {
             cout << "Bark!!";
        }
};

The above C++ program might infact be converted to an equivalent C program by the compiler during pre-compilation. Each newly defined structure gets an __id__ and this id will be used to lookup vtable (Virtual Method Table [6]) which maps id to the actual function memory location. One we get the memory location we will execute a jump instruction like we did before.

struct Cat
{
    int __id__; // 0
    char name[10];
};

struct Dog
{
    int __id__; // 1
    char name[10];
};

void cat_talk()
{
    printf("meow!");    
}

void dog_talk()
{
    printf("Bark!!");
}

void talk(void *object)
{
    int id = *(int *)object;
    vtable[id](); 
}
 
void (*vtable[2]);
vtable[0] = cat_talk;
vtable[1] = dog_talk;
 

// C++ function call like 
Animal *d = new Dog(); 
d->talk();

// might be changed like this by the compiler
Dog *d = malloc(sizeof(Dog));
d->__id__ = 1;
talk(d);

// talk function can use the __id__ attribute to offset into the vtable to identify which version of function to call.
Fig.1 - VTable Working


Virtual Method table or vtable or dispatch table [5] is the key to virtual function working. It is evident from the above that vtable infact makes the function calling little more expensive compared to normal function calls.

Performance Evaluation

From the above explanation it is evident that Virtual functions [7] [8] are more complicated that dispatching normal function calls. For example Virtual functions in C++ will need extra memory for each class to store the Virtual method table [5] and needs additional CPU cycles to perform vtable lookup, before it can make the actual function call.

This is one of the important reasons why not all functions are dynamically binded in C++. In C++, functions which are explicitly marked virtual are dynamically bound whereas remaining functions are statically bound during the compile time. When we say bound/bind we are actually talking about the mapping between the function name and memory location where it is stored for execution.

In languages like Java, the creators thought dynamic binding being an essential feature of OOL, hence, all functions are dynamically bound. It is also reasonable to assume that virtual functions might need 20 to 40 extra instruction sets to execute a function call. Given today's processors speed this might be very insignificant.

We performed a very small experiment to see if Dynamic dispatch is practically slower than static dispatch. We opted to choose C#, since this language provide both static binding and dynamic binding. We measured the time taken to execute 1000000000 dynamically dispatched functions and statically dispatched functions. Below are our observations.

Dynamic Dispatch:
00:00:10.0995777
00:00:10.4315966
00:00:10.4675987
Static Dispatch:
00:00:10.2565866
00:00:10.3255906
00:00:10.1575809

It looks like the performance overhead of dynamic binding is negligible in today's high performing computer. Even if it Dynamic Binding needs few more instruction to perform function call, the overhead in terms of time and memory is not much of a concern, but effective software design is a matter of concern. So we can conclude that Dynamic Binding does not cause serious performance problem in present day Computers.

Overriding and Overloading

Overriding [9] and Overloading [10] are two different features of OOL. In function overloading, which version of function to call is decided by the compiler during the compile time itself, where as when a function is overridden, which version of the function to call is deferred till runtime.

Let us see an two examples which provides insight into Overriding and Overloading.

Example - 1: Overloading

 public class A{
   public boolean equals( A check){                  # Equals method of parameter type A
     System.out.println( "A.equals method called");
     return true;
   }
 
   public static void main(String [] args){          # Main method
     Object a1 = new A();                            # Instantiate class to get objects
     A a2 = new A();
     Object o1 = new Object();

     a1.equals(a2);                                  # Object equals method called
     a2.equals(a1);                                  # Object equals method called
     a2.equals(a2);                                  # A equals method called
     a2.equals(o1);                                  # Object equals method called
   }
 }

Example - 2: Overriding

 public class A{
   @Override
   public boolean equals( Object check){             # Equals method of parameter type Object
     System.out.println( "A.equals method called");
     return true;
   }
 
   public static void main(String [] args){          # Main method
     Object a1 = new A();                            # Instantiate class to get objects
     A a2 = new A();
     Object o1 = new Object();

     a1.equals(a2);                                  # A equals method called
     a2.equals(a1);                                  # A equals method called
     a2.equals(a2);                                  # A equals method called
     a2.equals(o1);                                  # A equals method called
   }
 }


If you see the comments provided to the side of the method calls, you would note the difference in the equals method called in case of both Overloading and Overriding. The decision which method to use, basically has two phases: First overload resolution, then method dispatch. Overload resolution happens at compile-time, method dispatch at runtime.

Does Dynamic Dispatching hurts in today's power packed Computers?

Even though Dynamic Dispatching is done in run-time, and it takes good amount of memory, today's computers are power packed with good amount of memory. Hence efficiency concerns for implementing Dynamic Dispatching in OOL is of little concern.

However, the method to be called cannot be chosen based on the actual arguments passed to the function, rather, the method is called based on the declared type of the parameters.

Advantages of Dynamic Binding

Dynamic binding has several advantages. It provides tremendous flexibilities. Also, it allows the software to be malleable when requirements change as the system evolves. Because of dynamic binding, caller objects are not concerned how the invoked objects carry out their methods. All they need to know is that the invoked objects know how to carry out their responsibilities, but they themselves need to know only what the invoked objects can do for them. As a consequence of this, type dependencies (which are the bane of procedural programming) cannot have a ripple effect through the system, when system requirements change. The beauty is that such dependencies remain encapsulated within the objects. The flexibility that developers can derive out of this is enormous. For instance, one can install newer types without having to change or stopping the functioning of existing systems. This is something along the lines of "hot pluggable components" of the hardware cousins.

One other advantage of Dynamic binding based on parameter types is that more than one parameter can be used in the selection of a method. Methods that use dynamic binding in this way are called multi-methods and the concept is called Multiple Dispatch [11].

Multiple Dispatch

The Multiple Dispatch is used in Common Lisp Object System (CLOS) [12]. Multiple-polymorphism allows specialized functions or methods to be defined to handle various cases:

 +(int, int)
 +(int, float)
 +(int, complex) .. etc

The above functions are specialized to each of the cases required allowing single, highly cohesive and loosely coupled functions to be defined. This is also the true essence of object-oriented polymorphism, which allows objects to define methods for each specific case desired. In addition to better coupling and cohesion, multiple-polymorphism reduces program complexity by avoiding coding logic (switch statements) and because small methods further reduce complexity, as code complexity doesn't grow linearly with lines of code per method, but perhaps exponentially.

Conclusion

Hence from the above discussion, we can see that Dynamic Dispatch, though slower when compared to static dispatch is no longer slower due to the new generation of power packed computers. Also, dynamic dispatch can help in preserving the object-oriented concepts in highly complex software products. We have also looked up the differences between method Overloading and Overriding. We also had a sneak-view to Multiple Dispatch where more than one parameter can be used for Dynamic method lookup.


References

  1. Object Oriented Programming. en.wikipedia.org. Retrieved Oct 31, 2010.
  2. Method Overriding. en.wikipedia.org. Retrieved Oct 31, 2010.
  3. Dynamic Polymorphism. en.wikipedia.org. Retrieved Oct 31, 2010.
  4. Code Segment. en.wikipedia.org. Retrieved Oct 31, 2010.
  5. Virtual Method Table. en.wikipedia.org. Retrieved Oct 31, 2010.
  6. Virtual Functions. en.wikipedia.org. Retrieved Oct 31, 2010.
  7. Virtual Functions in C++. publib.boulder.ibm.com. Retrieved Oct 31, 2010.
  8. Method Overloading. en.wikipedia.org. Retrieved Oct 31, 2010.
  9. Multiple Dispatch. en.wikipedia.org. Retrieved Oct 31, 2010.