CSC/ECE 517 Fall 2012/ch1b 1w47 sk: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
 
(39 intermediate revisions by the same user not shown)
Line 1: Line 1:
<p> '''Generics''' </p>
==Introduction==
==Introduction==
Generic Programming is a programming paradigm for developing efficient and reusable software libraries. The Generic Programming process focuses on finding commonality among similar implementations of the same algorithm, then providing suitable abstractions so that a single, generic algorithm can cover many concrete implementations. This process is repeated until the generic algorithm has reached a suitable level of abstraction, where it provides maximal re-usability while still yielding efficient, concrete implementations.
Generic Programming is a programming paradigm for developing efficient and reusable software libraries. The Generic Programming process focuses on finding commonality among similar implementations of the same algorithm, then providing suitable abstractions so that a single, generic algorithm can cover many concrete implementations. This process is repeated until the generic algorithm has reached a suitable level of abstraction, where it provides maximal re-usability while still yielding efficient, concrete implementations.
Line 143: Line 145:


* Visual C++, C#, and Visual Basic all provide full support for defining and using generics.
* Visual C++, C#, and Visual Basic all provide full support for defining and using generics.
==== Advantages and Disadvanatges ====
Generics in .NET 2.0 are very powerful. They allow developers to write code without committing to a particular type, yet your code can enjoy type safety. Generics are implemented in such a way as to provide good performance and avoid code bloat. While there is the drawback of constraints' inability to specify that a type must be a base type of another type, the constraints mechanism gives us the flexibility to write code with a greater degree of freedom than sticking with the least-common-denominator capability of all types.
<br>
<br>
<br>


Line 285: Line 291:
Class.getTypeParameters() suggest that one might be able to find out what the parameter types are. But in reality, the identifiers that are used as the parameter placeholders. This is one important feature where Java differs from C++. Java generics are implemented using erasure.
Class.getTypeParameters() suggest that one might be able to find out what the parameter types are. But in reality, the identifiers that are used as the parameter placeholders. This is one important feature where Java differs from C++. Java generics are implemented using erasure.
So when one uses generics, the specific information is erased. The only thing one knows is that we are using an object. So both List<String> and List< Integer> have the same type at runtime.
So when one uses generics, the specific information is erased. The only thing one knows is that we are using an object. So both List<String> and List< Integer> have the same type at runtime.
 
<br>
==== Arrays of Generics ====
 
Its a good idea to use ArrayList when you need to create an array of generics. This behaves as an array so its a nice way to go around this problem.
<pre>
public class Array_Of_Generics<T> {
private List<T> arr = new ArrayList<T>();
public void add(T item) { arr.add(item); }
public T get(int index) { return arr.get(index); }
}
</pre>
 
If you still want to create an array of generics.  You can use the below snippet but still you cannot create array of the exact type.
 
<pre>
class Generic<T> {}
public class Array_Of_Generics_B {
static Generic<Integer>[] a;
}
</pre>


==== Advantages and Disadvantages ====
==== Advantages and Disadvantages ====


These are there pretty much for syntactic sugar. They are implemented through a controversial decision called type erasure. All they really do is prevent you from having to cast a whole lot, which makes them safer to use. Performance is identical to making specialized classes, except in cases where you are using what would have been a raw data type (int, float, double, char, bool, short). In these cases, the value types must be boxed to their corresponding reference types (Integer, Float, Double, Char, Bool, Short), which has some overhead. Memory usage is identical, since the JRE is just performing the casting in the background (which is essentially free). Java also has some nice type covariance and contravariance, which makes things look much cleaner than not using them.
With Java generics, we don't actually get any of the execution efficiency, because when one compiles a generic class in Java, the compiler takes away the type parameter and substitutes Object everywhere. So the compiled image for List<T> is like a List where it uses the type Object everywhere. Of course, if you now try to make a List<int>, you get boxing of all the ints. So there's a bunch of overhead there. Furthermore, to keep the VM happy, the compiler actually has to insert all of the type casts that were not written by the programmer. If it's a List of Object and we're trying to treat those Objects as Customers, at some point the Objects must be cast to Customers to keep the verifier happy. Java in their implementation does this by automatically inserting those type casts for you. Though this simplifies the programmer's job and avoids cluttering in code, there is no efficiency attained out of this optimization. Java's generics implementation relies on erasure of the type parameter, when we get to runtime, we don't actually have a faithful representation of what we had at compile time. When we apply reflection to a generic List in Java, we can't tell what the List is a List of


.
<br>


=== Generics in C++===
=== Generics in C++===


In C++ generics are called templates.Function templates are special functions that can operate with generic types. This allows us to create a function template whose functionality can be adapted to more than one type or class without repeating the entire code for each type.
C++ templates are a powerful mechanism for code reuse, as they enable the programmer to write code that behaves the same for data of any type.
 
==== Defining Templates ====
A Template Definition starts with the keyword template, followed by a list of Template Parameters. What follows is then either a class definition, or a function definition, defining a class template or a function template respectively. The template parameters introduce names into the scope of the definition, which can be types, values or templates. These names can be used just like any other name of the same kind. Then, when the template is instantiated, real types, values or templates are substituted in place of these names, and the code compiled.


In C++ this can be achieved using template parameters. A template parameter is a special kind of parameter that can be used to pass a type as argument: just like regular function parameters can be used to pass values to a function, template parameters allow to pass also types to a function. These function templates can use these parameters as if they were any other regular type.
Two template instantiations refer to the same template if their parameters are all the same, irrespective of any typedefs that may apply. Therefore vec1, vec2 and vec3 in the following example are all the same type.


The format for declaring function templates with type parameters is:
<pre>
<pre>
template <class identifier> function_declaration;
typedef std::string MyString;
template <typename identifier> function_declaration;
typedef std::vector<std::string> T1;
typedef std::vector<MyString> T2;
T1 vec1;
T2 vec2;
std::vector<std::string> vec3;
</pre>
</pre>
The only difference between both prototypes is the use of either the keyword class or the keyword typename. Its use is indistinct, since both expressions have exactly the same meaning and behave exactly the same way.
For example, to create a template function that returns the greater one of two objects we could use:


Multiple parameters may be specified, separated by commas in the template parameter list:
<pre>
<pre>
template <class myType>
template<typename T1,typename T2>
myType GetMax (myType a, myType b) {
class MyClass{};
return (a>b?a:b);
}
</pre>
</pre>


Here we have created a template function with myType as its template parameter. This template parameter represents a type that has not yet been specified, but that can be used in the template function as if it were a regular type. As you can see, the function template GetMax returns the greater of two parameters of this still-undefined type.
MyClass is thus a class template with two template type parameters, T1 and T2. Every time a template is referenced with distinct template arguments, then the template is instantiated with the arguments. If the resultant code is not valid, then a compilation error will occur.


To use this function template we use the following format for the function call:
Templates can have one or more template parameters, which can be Type parameters, Non-Type parameters, or Template parameters.


<pre>
===== Template Type Parameters=====
function_name <type> (parameters);
Template Type Parameters are template parameters that refer to a type; they are the most common form of template parameters.  The syntax for Template Type Parameters is as below:
</pre>


For example, to call GetMax to compare two integer values of type int we can write:
<pre>
<pre>
int x,y;
typename name
GetMax <int> (x,y);
or
class name
</pre>
</pre>


When the compiler encounters this call to a template function, it uses the template to automatically generate a function replacing each appearance of myType by the type passed as the actual template parameter (int in this case) and then calls it. This process is automatically performed by the compiler and is invisible to the programmer.
Both alternatives are identical in meaning, and the choice is merely a matter of style. name is any valid C++ symbol name. Once name has been introduced in the template parameter list, then any reference to name within the body of the template automatically refers to the type of the corresponding template argument for each instantiation, and can be used anywhere a type can normally be used. e.g.


Here is the entire example:
<pre>
<pre>
// function template
template<typename T>
#include <iostream>
void func(T value)
using namespace std;
{
 
const T& ref=value;
template <class T>
T* p=new T;
T GetMax (T a, T b) {
T temp(23);
  T result;
  result = (a>b)? a : b;
  return (result);
}
 
int main () {
  int i=5, j=6, k;
  long l=10, m=5, n;
  k=GetMax<int>(i,j);
  n=GetMax<long>(l,m);
  cout << k << endl;
  cout << n << endl;
  return 0;
}
}
</pre>
</pre>


In this case, we have used T as the template parameter name instead of myType because it is shorter and in fact is a very common template parameter name. But you can use any identifier you like. In the example above we used the function template GetMax() twice. The first time with arguments of type int and the second one with arguments of type long. The compiler has instantiated and then called each time the appropriate version of the function.
The code generated for func<int> is then identical to:
 
As you can see, the type T is used within the GetMax() template function even to declare new objects of that type:
 
<pre>
T result;
</pre>
 
Therefore, result will be an object of the same type as the parameters a and b when the function template is instantiated with a specific type.
 
In this specific case where the generic type T is used as a parameter for GetMax the compiler can find out automatically which data type has to instantiate without having to explicitly specify it within angle brackets (like we have done before specifying <int> and <long>). So we could have written instead:
 
<pre>
int i,j;
GetMax (i,j);
</pre>
 
Since both i and j are of type int, and the compiler can automatically find out that the template parameter can only be int. This implicit method produces exactly the same result:


<pre>
<pre>
// function template II
void func(int value)
#include <iostream>
{
using namespace std;
const int& ref=value;
 
int* p=new int;
template <class T>
int temp(23);
T GetMax (T a, T b) {
  return (a>b?a:b);
}
 
int main () {
  int i=5, j=6, k;
  long l=10, m=5, n;
  k=GetMax(i,j);
  n=GetMax(l,m);
  cout << k << endl;
  cout << n << endl;
  return 0;
}
}
</pre>
</pre>


Notice how in this case, we called our function template GetMax() without explicitly specifying the type between angle-brackets <>. The compiler automatically determines what type is needed on each call.
If, however, a reference was made to func<std::string>, then a compilation error would result, as the statement std::string temp(23); is not valid.


Because our template function includes only one template parameter (class T) and the function template itself accepts two parameters, both of this T type, we cannot call our function template with two objects of different types as arguments:
===== Template Non-Type Parameters =====
Non-Type Template Parameters are template parameters that are values rather than types. They can be any value that is a compile-time constant. Their syntax is akin to a variable declaration, e.g.:


<pre>
<pre>
int i;
template<int i>
long l;
class A{};
k = GetMax (i,l);
template<double* dp>
class B{};
template<void (*func)(int)>
void c()
{}
</pre>
</pre>


This would not be correct, since our GetMax function template expects two arguments of the same type, and in this call to it we use objects of two different types.
Class template A has a non-type template parameter which is an integer, class template B has a non-type template parameter which is a pointer-to-double, and function template c has a non-type template parameter which is a pointer to a function returning void, with a single int parameter.  


We can also define function templates that accept more than one type parameter, simply by specifying more template parameters between the angle brackets. For example:
===== Template Template Parameters =====
Template Template Parameters enable a template to be parameterized by the name of another template. Say, for example, there is a class which contains a couple of collections of items, some strings, some integers, and one wants the users of the class to choose what type of collection to use (vector, list, stack, etc.). The natural thought is to make the collection type a template parameter. However, a collection of strings is a different type from a collection of integers, so the user would have to specify both individually if template type parameters are used. The solution is a Template Template Parameter which is shown in the code below:


<pre>
<pre>
template <class T, class U>
template<template<typename T> class ContainerType>
T GetMin (T a, U b) {
class MyClass
  return (a<b?a:b);
{
}
ContainerType<int> intContainer;
ContainerType<std::string> stringContainer;
// rest of class
};
</pre>
</pre>


In this case, our function template GetMin() accepts two parameters of different types and returns an object of the same type as the first parameter (T) that is passed. For example, after that declaration we could call GetMin() with:
ContainerType is a Template Template Parameter that refers to a template with a single Template Type Parameter. One can thus say MyClass<vector> or MyClass<list> to have vectors or lists respectively.Within the template definition, the Template Template Parameters can be used just like any other template.


==== Using Templates ====
Class templates and function templates can be used anywhere normal classes and functions can be, as well as as Template Template Parameters  to other templates. However, the compiler needs to know what to use for the template parameters. For class templates, the template name must be followed by a template argument list, specifying the parameters. This is a comma-separated list of expressions between angle brackets (<>). For Template Type Parameters, the corresponding expression must name a type, for Template Non-Type Parameters, the expression must evaluate
to a compile-time constant of the appropriate type, and for Template Template Parameters, the expression must name a template with the correct signature. e.g.:
<pre>
<pre>
int i,j;
template<typename T,unsigned i>
long l;
struct FixedArray
i = GetMin<int,long> (j,l);
{
</pre>
T data[i];
 
};
or simply:
FixedArray<int,3> a; // array of 3 integers
 
FixedArray<int,1+6/3> b; // array of 3 integers
<pre>  
template<template<typename T,typename Allocator> class Container>
i = GetMin (j,l);
struct ContainerPair
{
Container<int,std::allocator<int> > intContainer;
Container<std::string,std::allocator<std::string> > stringContainer;
};
ContainerPair<std::deque> deqCont; // two std::deques
ContainerPair<std::vector> vecCont; // two std::vectors
</pre>
</pre>
The second example demonstrates two things. Firstly, the template signature of the Template Template Parameter must exactly match the signature of the template passed as the argument, and standard containers have two template parameters. Secondly, if the last argument of a template parameter list is a template reference, as for Container<int,std::allocator<int> >, then the two closing angle brackets (>) must be separated by a space, to avoid being interpreted as the shift right operator (>>).


even though j and l have different types, since the compiler can determine the appropriate instantiation anyway.
==== Templates requirements and concepts ====
 
Class templates
We also have the possibility to write class templates, so that a class can have members that use template parameters as types. For example:
 
<pre>
template <class T>
class mypair {
    T values [2];
  public:
    mypair (T first, T second)
    {
      values[0]=first; values[1]=second;
    }
};
</pre>


The class that we have just defined serves to store two elements of any valid type. For example, if we wanted to declare an object of this class to store two integer values of type int with the values 115 and 36 we would write:
Templates implicitly impose requirements on their parameters, particularly Template Type Parameters and Template Template Parameters. These requirements generally take the form of operations that must work on objects of the appropriate type, or class members that must exist and refer to objects or types or functions (with the implicit extra requirement that the type referred to is a class). A Concept is a set of requirements that describe a useful feature of a type. For example, the C++ Standard Library makes reference to things being Assignable or Copy-Constructible; these are Concepts that make the following requirements:


<pre>
* Given a type T, that type is Copy-Constructible if the expression T a(b); is defined, where b is an expression of type T, and the resultant object a has a value equivalent to the value of the expression b. That type is Assignable if the expression a=b; is defined, where a and b are expressions of type T, and the value of the object referred to by the expression a is equivalent to the value of the object referred to by the expression b after the assignment.
mypair<int> myobject (115, 36);  
</pre>


this same class would also be used to create an object to store any other type:
* All built-in types are both Copy-Constructible and Assignable. For classes, these requirements translate into requirements on member functions.


<pre>
* A class T is Copy-Constructible if it has a constructor which can be called with one argument of type const-reference-to-T. This may be a constructor with one argument, or it may be a constructor with more than one argument, with default values provided for the remaining arguments. The object thus constructed should have a value equivalent to that of the argument.
mypair<double> myfloats (3.0, 2.18);
</pre>


The only member function in the previous class template has been defined inline within the class declaration itself. In case that we define a function member outside the declaration of the class template, we must always precede that definition with the template <...> prefix:
* A class T is Assignable if it defines a copy-assignment operator (operator=()) which has an argument of type T, or const-reference-to-T. The object to which the member function belongs should have a value equivalent to that of the argument after the completion of such a member function.


==== Template Specialization and Overloading ====
Template Specialization allows one to decide that for a specific set of template parameters, the code instantiated for a template should be different to the general case. Say, for example, one wish to use a template with a new class as a Template Type Parameter, and though it has the same semantic behaviour as the classes the
template was designed for, the syntax is different, so the original template won’t compile with the new class as a parameter. The solution to this problem is specialization. Consider the following template function definition:
<pre>
<pre>
// class templates
template<typename T>
#include <iostream>
void func(T value)
using namespace std;
 
template <class T>
class mypair {
    T a, b;
  public:
    mypair (T first, T second)
      {a=first; b=second;}
    T getmax ();
};
 
template <class T>
T mypair<T>::getmax ()
{
{
  T retval;
value.add(3);
  retval = a>b? a : b;
  return retval;
}
}
int main () {
  mypair <int> myobject (100, 75);
  cout << myobject.getmax();
  return 0;
}
</pre>
Notice the syntax of the definition of member function getmax:
<pre>
template <class T>
T mypair<T>::getmax ()
</pre>
</pre>
 
This function assumes that the type substituted for the Template Type Parameter T is a class with a member function add that takes a single parameter of a type that can be constructed from an int. Thus the following classes would be OK:
Confused by so many T's? There are three T's in this declaration: The first one is the template parameter. The second T refers to the type returned by the function. And the third T (the one between angle brackets) is also a requirement: It specifies that this function's template parameter is also the class template parameter.
 
Template specialization
If we want to define a different implementation for a template when a specific type is passed as template parameter, we can declare a specialization of that template.
 
For example, let's suppose that we have a very simple class called mycontainer that can store one element of any type and that it has just one member function called increase, which increases its value. But we find that when it stores an element of type char it would be more convenient to have a completely different implementation with a function member uppercase, so we decide to declare a class template specialization for that type:
 
<pre>
<pre>
// template specialization
class T1
#include <iostream>
{
using namespace std;
public:
 
void add(int i,double d=0.0);
// class template:
template <class T>
class mycontainer {
    T element;
  public:
    mycontainer (T arg) {element=arg;}
    T increase () {return ++element;}
};
};
 
class T2
// class template specialization:
{
template <>
public:
class mycontainer <char> {
std::string add(double d);
    char element;
  public:
    mycontainer (char arg) {element=arg;}
    char uppercase ()
    {
      if ((element>='a')&&(element<='z'))
      element+='A'-'a';
      return element;
    }
};
};
int main () {
  mycontainer<int> myint (7);
  mycontainer<char> mychar ('j');
  cout << myint.increase() << endl;
  cout << mychar.uppercase() << endl;
  return 0;
}
</pre>
This is the syntax used in the class template specialization:
<pre>
template <> class mycontainer <char> { ... };
</pre>
</pre>


First of all, notice that we precede the class template name with an empty template<> parameter list. This is to explicitly declare it as a template specialization.
However, the class T3 below is not OK, because Add has a capital A.
 
But more important than this prefix, is the <char> specialization parameter after the class template name. This specialization parameter itself identifies the type for which we are going to declare a template class specialization (char). Notice the differences between the generic class template and the specialization:
 
<pre>
<pre>
template <class T> class mycontainer { ... };
class T3
template <> class mycontainer <char> { ... };
{
</pre>
public:
 
void Add(int i);
The first line is the generic template, and the second one is the specialization.
 
When we declare specializations for a template class, we must also define all its members, even those exactly equal to the generic template class, because there is no "inheritance" of members from the generic template to the specialization.
 
Non-type parameters for templates
Besides the template arguments that are preceded by the class or typename keywords , which represent types, templates can also have regular typed parameters, similar to those found in functions. As an example, have a look at this class template that is used to contain sequences of elements:
 
<pre>
// sequence template
#include <iostream>
using namespace std;
 
template <class T, int N>
class mysequence {
    T memblock [N];
  public:
    void setmember (int x, T value);
    T getmember (int x);
};
};
template <class T, int N>
void mysequence<T,N>::setmember (int x, T value) {
  memblock[x]=value;
}
template <class T, int N>
T mysequence<T,N>::getmember (int x) {
  return memblock[x];
}
int main () {
  mysequence <int,5> myints;
  mysequence <double,5> myfloats;
  myints.setmember (0,100);
  myfloats.setmember (3,3.1416);
  cout << myints.getmember(0) << '\n';
  cout << myfloats.getmember(3) << '\n';
  return 0;
}
100
3.1416
</pre>
</pre>
This means that one cannot use the original function template func for T3.


It is also possible to set default values or types for class template parameters. For example, if the previous class template definition had been:
==== Advantages and Disadvanatges ====
<pre>
template <class T=char, int N=10> class mysequence {..};
</pre>
We could create objects using the default template parameters by declaring:
<pre>
mysequence<> myseq;
</pre>
 
Which would be equivalent to:
<pre>
mysequence<char,10> myseq;
</pre>
 
==== Advantages and Disadvantages ====
 
These actually generate different classes based on the input type. An std::vector<int> is a completely different class than an std::vector<float>. There is no support for covariance or contravariance, but there is support for passing non-types to templates, partial template specialization. They basically allow you to do whatever you want.
 
However, since C++ templates create different classes for every variation of their template parameters, the size of the compiled executable is larger. Beyond that, compilation time increases greatly, since all template code must be included with each compilation unit and much more code must be generated. However, actual runtime memory footprint is typically smaller than the alternative (frees an extra void*) and performance is better, since the compiler can perform more aggressive optimizations with the known type.
 
While a generic Java class compiles it's entire self, when using a C++ template, you only compile what you use. So, if you create an std::vector<int> and only use push_back and size, only those functions will be compiled into the object file. This eases the size of executable problem.
 
==  Comparing C# and Java Generics ==
 
Java's generics implementation was based on a project originally called Pizza, which was done by Martin Odersky and others. Pizza was renamed GJ, then it turned into a JSR and ended up being adopted into the Java language. And this particular generics proposal had as a key design goal that it could run on an unmodified VM [Virtual Machine]. It is, of course, great that you don't have to modify your VM, but it also brings about a whole bunch of odd limitations. The limitations are not necessarily directly apparent, but you very quickly go, "Hmm, that's strange."
 
For example, with Java generics, you don't actually get any of the execution efficiency that I talked about, because when you compile a generic class in Java, the compiler takes away the type parameter and substitutes Object everywhere. So the compiled image for List<T> is like a List where you use the type Object everywhere. Of course, if you now try to make a List<int>, you get boxing of all the ints. So there's a bunch of overhead there. Furthermore, to keep the VM happy, the compiler actually has to insert all of the type casts you didn't write. If it's a List of Object and you're trying to treat those Objects as Customers, at some point the Objects must be cast to Customers to keep the verifier happy. And really all they're doing in their implementation is automatically inserting those type casts for you. So you get the syntactic sugar, or some of it at least, but you don't get any of the execution efficiency. So that's issue number one I have with Java's solution.
 
Issue number two, and I think this is probably an even bigger issue, is that because Java's generics implementation relies on erasure of the type parameter, when you get to runtime, you don't actually have a faithful representation of what you had at compile time. When you apply reflection to a generic List in Java, you can't tell what the List is a List of. It's just a List. Because you've lost the type information, any type of dynamic code-generation scenario, or reflection-based scenario, simply doesn't work. If there's one trend that's pretty clear to me, it's that there's more and more of that. And it just doesn't work, because you've lost the type information. Whereas in our implementation, all of that information is available. You can use reflection to get the System.Type for object List<T>. You cannot actually create an instance of it yet, because you don't know what T is. But then you can use reflection to get the System.Type for int. You can then ask reflection to please put these two together and create a List<int>, and you get another System.Type for List<int>. So representationally, anything you can do at compile time you can also do at runtime.
 
== Comparing C# Generics to C++ Templates ==


To me the best way to understand the distinction between C# generics and C++ templates is this: C# generics are really just like classes, except they have a type parameter. C++ templates are really just like macros, except they look like classes.
Generics generate different classes based on the input type. An std::vector<int> is a completely different class than an std::vector<float>. However, since C++ templates create different classes for every variation of their template parameters, the size of the compiled executable is larger. Beyond that, compilation time increases greatly, since all template code must be included with each compilation unit and much more code must be generated. However, actual runtime memory footprint is typically smaller than the alternative (frees an extra void*) and performance is better, since the compiler can perform more aggressive optimizations with the known type. While a generic Java class compiles it's entire self, when using a C++ template, you only compile what you use. So, if you create an std::vector<int> and only use push_back and size, only those functions will be compiled into the object file. This eases the size of executable problem.


The big difference between C# generics and C++ templates shows up in when the type checking occurs and how the instantiation occurs. First of all, C# does the instantiation at runtime. C++ does it at compile time, or perhaps at link time. But regardless, the instantiation happens in C++ before the program runs. That's difference number one. Difference number two is C# does strong type checking when you compile the generic type. For an unconstrained type parameter, like List<T>, the only methods available on values of type T are those that are found on type Object, because those are the only methods we can generally guarantee will exist. So in C# generics, we guarantee that any operation you do on a type parameter will succeed.
== Conclusion ==


C++ is the opposite. In C++, you can do anything you damn well please on a variable of a type parameter type. But then once you instantiate it, it may not work, and you'll get some cryptic error messages. For example, if you have a type parameter T, and variables x and y of type T, and you say x + y, well you had better have an operator+ defined for + of two Ts, or you'll get some cryptic error message. So in a sense, C++ templates are actually untyped, or loosely typed. Whereas C# generics are strongly typed.
Generics in all languages have the common advantage of saving developer's effort on type casting of individual types and generalizes functionality irrespective of individual data types hence reducing code duplication, code cluttering and improving code re-usability. Compilation and generation of final executable follows a different path in C++ and Java. In Java, all the generic classes of different types are all considered the same with type Object, while in C++ each of the templates is a different class. As a result, in Java, there is not increase in the size of executable files, but there is no improvement in runtime efficiency as well. In C++, however, the size of the executable is much large, but there is improved efficiency.


==References==
==References==
Line 662: Line 473:
4. http://www.cplusplus.com/doc/tutorial/templates <br>
4. http://www.cplusplus.com/doc/tutorial/templates <br>
5. http://osl.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf <br>
5. http://osl.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf <br>
6. http://msdn.microsoft.com/en-us/library/ms172192.aspx#Y1524
6. http://msdn.microsoft.com/en-us/library/ms172192.aspx#Y1524 <br>
7. http://www.justsoftwaresolutions.co.uk/articles/intrototemplates.pdf<br>
8. http://www.sgi.com/tech/stl/stl_introduction.html

Latest revision as of 03:05, 2 October 2012

Generics

Introduction

Generic Programming is a programming paradigm for developing efficient and reusable software libraries. The Generic Programming process focuses on finding commonality among similar implementations of the same algorithm, then providing suitable abstractions so that a single, generic algorithm can cover many concrete implementations. This process is repeated until the generic algorithm has reached a suitable level of abstraction, where it provides maximal re-usability while still yielding efficient, concrete implementations.

In the simplest definition, generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by Ada in 1983, permits writing common functions or types that differ only in the set of types on which they operate on, thus reducing duplication. Such software entities are known as generics in Ada, Eiffel, Java and .NET; parametric polymorphism in ML, Scala and Haskell ; templates in C++; and parameterized types in the influential 1994 book, Design Patterns.

The term generic programming was originally coined by David Musser and Alexander Stepanov in a more specific sense than the above, to describe an approach to software decomposition whereby fundamental requirements on types are abstracted from across concrete examples of algorithms and data structures and formalized as concepts, analogously to the abstraction of algebraic theories in abstract algebra. Early examples of this programming approach were implemented in Scheme and Ada, although the best known example is the Standard Template Library (STL) in which is developed a theory of iterators which is used to decouple sequence data structures and the algorithms operating on them.

Generics in Object Oriented Languages

Generics .NET framework

Generics are classes, structures, interfaces, and methods that have placeholders (type parameters) for one or more of the types that they store or use. A generic collection class might use a type parameter as a placeholder for the type of objects that it stores; the type parameters appear as the types of its fields and the parameter types of its methods. A generic method might use its type parameter as the type of its return value or as the type of one of its formal parameters.

The following code illustrates a simple generic class definition in C#.

public class Generic<T>
{
    public T Field;
}

The following code illustrates a simple generic class definition in Visual Basic.

Public Class Generic(Of T)
    Public Field As T
End Class

When an instance of a generic class is created, one can specify the actual types to substitute for the type parameters. This establishes a new generic class, referred to as a constructed generic class, with the chosen types substituted everywhere that the type parameters appear. The result is a type-safe class that is tailored to the user's choice of types, which is illustrated in the code below in C# and VB.

//Code in C#
public static void Main()
{
    Generic<string> g = new Generic<string>();
    g.Field = "A string";
}
'Code in Visual Basic
Public Shared Sub Main()
    Dim g As New Generic(Of String)
    g.Field = "A string" 
End Sub

Generics Terminology in .NET

  • A generic type definition is a class, structure, or interface declaration that functions as a template, with placeholders for the types that it can contain or use. For example, the Dictionary class can contain two types: keys and values. Because a generic type definition is only a template, so one cannot create instances of a class, structure, or interface that is a generic type definition.
  • Generic type parameters, or type parameters, are the placeholders in a generic type or method definition. The Dictionary generic type has two type parameters, TKey and TValue, that represent the types of its keys and values.
  • A constructed generic type, or constructed type, is the result of specifying types for the generic type parameters of a generic type definition.
  • A generic type argument is any type that is substituted for a generic type parameter.
  • The general term generic type includes both constructed types and generic type definitions.
  • Covariance and contravariance of generic type parameters enable one to use constructed generic types whose type arguments are more derived (covariance) or less derived (contravariance) than a target constructed type. Covariance and contravariance are collectively referred to as variance.
  • Constraints are limits placed on generic type parameters. For example, one might limit a type parameter to types that implement the IComparer generic interface, to ensure that instances of the type can be ordered. One can also constrain type parameters to types that have a particular base class, that have a default constructor, or that are reference types or value types. Users of the generic type cannot substitute type arguments that do not satisfy the constraints.
  • A generic method definition is a method with two parameter lists: a list of generic type parameters and a list of formal parameters. Type parameters can appear as the return type or as the types of the formal parameters. The following code shows this in C# and Visual Basic.
//Code in C#
T Generic<T>(T arg)
{
    T temp = arg;
    //... 
    return temp;
}
'Code in Visual Basic
Function Generic(Of T)(ByVal arg As T) As T
    Dim temp As T = arg
    '... 
    Return temp
End Function
  • Generic methods can appear on generic or nongeneric types. A method is not generic just because it belongs to a generic type, or even because it has formal parameters whose types are the generic parameters of the enclosing type. A method is generic only if it has its own list of type parameters. In the following code, only method G is generic.
//Code in C#
class A
{
    T G<T>(T arg)
    {
        T temp = arg;
        //... 
        return temp;
    }
}
class Generic<T>
{
    T M(T arg)
    {
        T temp = arg;
        //... 
        return temp;
    }
}
'Code in Visual Basic
Class A
    Function G(Of T)(ByVal arg As T) As T
        Dim temp As T = arg
        '... 
        Return temp
    End Function 
End Class 
Class Generic(Of T)
    Function M(ByVal arg As T) As T
        Dim temp As T = arg
        '... 
        Return temp
    End Function 
End Class

Nested Types and Generics

A type that is nested in a generic type can depend on the type parameters of the enclosing generic type. The common language runtime considers nested types to be generic, even if they do not have generic type parameters of their own. When you create an instance of a nested type, you must specify type arguments for all enclosing generic types.

Language Support

The .NET Framework provides a number of generic collection classes in the following namespaces:

  • The System.Collections.Generic namespace catalogs most of the generic collection types provided by the .NET Framework, such as the List and Dictionary generic classes.
  • The System.Collections.ObjectModel namespace catalogs additional generic collection types, such as the ReadOnlyCollection generic class, that are useful for exposing object models to users of your classes.
  • Generic interfaces for implementing sort and equality comparisons are provided in the System namespace, along with generic delegate types for event handlers, conversions, and search predicates.
  • The common language runtime provides new opcodes and prefixes to support generic types in Microsoft intermediate language (MSIL), including Stelem, Ldelem, Unbox_Any, Constrained, and Readonly.
  • Visual C++, C#, and Visual Basic all provide full support for defining and using generics.

Advantages and Disadvanatges

Generics in .NET 2.0 are very powerful. They allow developers to write code without committing to a particular type, yet your code can enjoy type safety. Generics are implemented in such a way as to provide good performance and avoid code bloat. While there is the drawback of constraints' inability to specify that a type must be a base type of another type, the constraints mechanism gives us the flexibility to write code with a greater degree of freedom than sticking with the least-common-denominator capability of all types.

Generics in Java

Generics were introduced in Java from JDK 1.5. Generics allows the abstraction over types. The most common examples are container types, such as those in the Collection hierarchy. So, if one could say that the code works with some unspecified type, rather than a specific interface or class, then the code uses Generics. Below is an example of a Genric:

List myIntList = new LinkedList(); 
myIntList.add(new Integer(0)); 
Integer x = (Integer) myIntList.iterator().next(); 

Typically, a programmer knows what kind of data has been placed into a particular list. However, the cast is essential. The compiler can only guarantee that an Object will be returned by the iterator. To ensure the assignment to a variable of type Integer is type safe, the cast is required. Of course, the cast not only introduces clutter, but also introduces the possibility of a run time error, since the programmer might be mistaken. It would be better if programmers could actually express their intent, and mark a list as being restricted to contain a particular data type. This is the core idea behind generics.

Here is a variation of the program fragment given above using generics:

List<Integer> myIntList = new LinkedList<Integer>();
myIntList.add(new Integer(0)); 
Integer x = myIntList.iterator().next();

The type declaration for the variable myIntList specifies that this is not just an arbitrary List, but a List of Integer, written List<Integer>. The List is a generic interface that takes a type parameter - in this case, Integer. One has to specify a type parameter when creating the list object. The compiler can now check the type correctness of the program at compile-time. Because myIntList is declared with type List<Integer>, this indicates that myIntList holds a list of integers, which holds true wherever and whenever it is used, and the compiler ensures to guarantee it. In contrast, the cast tells something the programmer thinks is true at a single point in the code. The net effect, especially in large programs, is improved readability and robustness.

Below is code for generic interfaces List and Iterator in package java.util.The content in angle brackets are the declarations of the formal type parameters of the interfaces List and Iterator.

public interface List<E> {
void add(E x);
Iterator<E> iterator();
}
public interface Iterator<E> {
E next();
boolean hasNext();
}

Generics and Subtyping

Below is an example showing code for Generics and Subtyping.

List<String> ls = new ArrayList<String>(); //1
List<Object> lo = ls; //2
lo.add(new Object()); //3
String s = ls.get(0); //4 attempts to assign an Object to a String

Line 1 is legal. Lines 2,3 and 4 aliased ls and lo. Accessing ls, a list of String, through the alias lo, one can insert arbitrary objects into it. As a result ls does not hold just Strings anymore. The Java compiler will prevent this from happening and Line 2 causes a compile time error.In general, if Foo is a subtype (subclass or subinterface) of Bar, and G is some generic type declaration, it is not the case that G<Foo> is a subtype of G<Bar>.

Wildcards

Consider the problem of writing a routine that prints out all the elements in a collection. In older version's of Java, it can be written as below:

void printCollection(Collection c) {
   Iterator i = c.iterator();
   for (k = 0; k < c.size(); k++) {
       System.out.println(i.next());
   }
}

In the later versions of Java, the same logic above can be written using generics using the syntax of enhanced for loop:

void printCollection(Collection<Object> c) {
   for (Object e : c) {
      System.out.println(e);
   }
}

The old code could be called with any kind of collection as a parameter, the new code only takes Collection<Object>, which is not a supertype of all kinds of collections. The supertype of all kinds of collections is written Collection<?>, that is, a collection whose element type matches anything. It’s called a wildcard type. Below is the code expressing the same functionality using a wildcard.

void printCollection(Collection<?> c) {
   for (Object e : c) {
      System.out.println(e);
   }
}

The above code can be called with any type of collection. Inside printCollection(), one can still read elements from c and give them type Object. This is always safe, since whatever the actual type of the collection, it does contain objects. However, It is not safe to add arbitrary objects to it.

Generic Methods

Generic methods are used more often than generic classes. A generic class is independent of whether you have a generic method. If a method is static, it has no access to the generic type parameters of the class, so if it needs to use genericity it must be a generic method. Unlike generic class one don’t usually have to specify the parameter types, because the compiler can figure that out.

public class Gen {
   public <T> void f(T x) {
       System.out.println(x.getClass().getName());
   }

   public static void main(String[] args) {
       Gen a = new Gen();
       a.f("");
       a.f(‘c’);
   }
}

Erasure

To implement generics, the Java compiler applies type erasure to:

  • Replace all type parameters in generic types with their bounds or Object if the type parameters are unbounded. The produced bytecode, therefore, contains only ordinary classes, interfaces, and methods.
  • Insert type casts if necessary to preserve type safety.
  • Generate bridge methods to preserve polymorphism in extended generic types.

Type erasure ensures that no new classes are created for parameterized types; consequently, generics incur no runtime overhead.

Consider the example below.

public class A {
   public static void main(String[] args) {
       System.out.println(new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass());
   }
}

Here ArrayList<String> and ArrayList<Integer> seem to be of different types and so one expects them to behave differently. But the execution of the above program returns true which suggests they are of the same type.

class Fn {}
class Foo<Q> {}
class Boo<X,Y> {}

public class AbsentInformation {
   public static void main(String[] args) {
      Foo<Fn> foo = new Foo<Fn>();
      Boo<Long,Double> b = new Boo<Long,Double>();

      System.out.println(Arrays.toString(foo.getClass().getTypeParameters()));
      System.out.println(Arrays.toString(b.getClass().getTypeParameters()));
   }
} 
Output:
[Q]
[X,Y]

Class.getTypeParameters() suggest that one might be able to find out what the parameter types are. But in reality, the identifiers that are used as the parameter placeholders. This is one important feature where Java differs from C++. Java generics are implemented using erasure. So when one uses generics, the specific information is erased. The only thing one knows is that we are using an object. So both List<String> and List< Integer> have the same type at runtime.

Advantages and Disadvantages

With Java generics, we don't actually get any of the execution efficiency, because when one compiles a generic class in Java, the compiler takes away the type parameter and substitutes Object everywhere. So the compiled image for List<T> is like a List where it uses the type Object everywhere. Of course, if you now try to make a List<int>, you get boxing of all the ints. So there's a bunch of overhead there. Furthermore, to keep the VM happy, the compiler actually has to insert all of the type casts that were not written by the programmer. If it's a List of Object and we're trying to treat those Objects as Customers, at some point the Objects must be cast to Customers to keep the verifier happy. Java in their implementation does this by automatically inserting those type casts for you. Though this simplifies the programmer's job and avoids cluttering in code, there is no efficiency attained out of this optimization. Java's generics implementation relies on erasure of the type parameter, when we get to runtime, we don't actually have a faithful representation of what we had at compile time. When we apply reflection to a generic List in Java, we can't tell what the List is a List of


Generics in C++

C++ templates are a powerful mechanism for code reuse, as they enable the programmer to write code that behaves the same for data of any type.

Defining Templates

A Template Definition starts with the keyword template, followed by a list of Template Parameters. What follows is then either a class definition, or a function definition, defining a class template or a function template respectively. The template parameters introduce names into the scope of the definition, which can be types, values or templates. These names can be used just like any other name of the same kind. Then, when the template is instantiated, real types, values or templates are substituted in place of these names, and the code compiled.

Two template instantiations refer to the same template if their parameters are all the same, irrespective of any typedefs that may apply. Therefore vec1, vec2 and vec3 in the following example are all the same type.

typedef std::string MyString;
typedef std::vector<std::string> T1;
typedef std::vector<MyString> T2;
T1 vec1;
T2 vec2;
std::vector<std::string> vec3;

Multiple parameters may be specified, separated by commas in the template parameter list:

template<typename T1,typename T2>
class MyClass{};

MyClass is thus a class template with two template type parameters, T1 and T2. Every time a template is referenced with distinct template arguments, then the template is instantiated with the arguments. If the resultant code is not valid, then a compilation error will occur.

Templates can have one or more template parameters, which can be Type parameters, Non-Type parameters, or Template parameters.

Template Type Parameters

Template Type Parameters are template parameters that refer to a type; they are the most common form of template parameters. The syntax for Template Type Parameters is as below:

typename name
or
class name

Both alternatives are identical in meaning, and the choice is merely a matter of style. name is any valid C++ symbol name. Once name has been introduced in the template parameter list, then any reference to name within the body of the template automatically refers to the type of the corresponding template argument for each instantiation, and can be used anywhere a type can normally be used. e.g.

template<typename T>
void func(T value)
{
const T& ref=value;
T* p=new T;
T temp(23);
}

The code generated for func<int> is then identical to:

void func(int value)
{
const int& ref=value;
int* p=new int;
int temp(23);
}

If, however, a reference was made to func<std::string>, then a compilation error would result, as the statement std::string temp(23); is not valid.

Template Non-Type Parameters

Non-Type Template Parameters are template parameters that are values rather than types. They can be any value that is a compile-time constant. Their syntax is akin to a variable declaration, e.g.:

template<int i>
class A{};
template<double* dp>
class B{};
template<void (*func)(int)>
void c()
{}

Class template A has a non-type template parameter which is an integer, class template B has a non-type template parameter which is a pointer-to-double, and function template c has a non-type template parameter which is a pointer to a function returning void, with a single int parameter.

Template Template Parameters

Template Template Parameters enable a template to be parameterized by the name of another template. Say, for example, there is a class which contains a couple of collections of items, some strings, some integers, and one wants the users of the class to choose what type of collection to use (vector, list, stack, etc.). The natural thought is to make the collection type a template parameter. However, a collection of strings is a different type from a collection of integers, so the user would have to specify both individually if template type parameters are used. The solution is a Template Template Parameter which is shown in the code below:

template<template<typename T> class ContainerType>
class MyClass
{
ContainerType<int> intContainer;
ContainerType<std::string> stringContainer;
// rest of class
};

ContainerType is a Template Template Parameter that refers to a template with a single Template Type Parameter. One can thus say MyClass<vector> or MyClass<list> to have vectors or lists respectively.Within the template definition, the Template Template Parameters can be used just like any other template.

Using Templates

Class templates and function templates can be used anywhere normal classes and functions can be, as well as as Template Template Parameters to other templates. However, the compiler needs to know what to use for the template parameters. For class templates, the template name must be followed by a template argument list, specifying the parameters. This is a comma-separated list of expressions between angle brackets (<>). For Template Type Parameters, the corresponding expression must name a type, for Template Non-Type Parameters, the expression must evaluate to a compile-time constant of the appropriate type, and for Template Template Parameters, the expression must name a template with the correct signature. e.g.:

template<typename T,unsigned i>
struct FixedArray
{
T data[i];
};
FixedArray<int,3> a; // array of 3 integers
FixedArray<int,1+6/3> b; // array of 3 integers
template<template<typename T,typename Allocator> class Container>
struct ContainerPair
{
Container<int,std::allocator<int> > intContainer;
Container<std::string,std::allocator<std::string> > stringContainer;
};
ContainerPair<std::deque> deqCont; // two std::deques
ContainerPair<std::vector> vecCont; // two std::vectors

The second example demonstrates two things. Firstly, the template signature of the Template Template Parameter must exactly match the signature of the template passed as the argument, and standard containers have two template parameters. Secondly, if the last argument of a template parameter list is a template reference, as for Container<int,std::allocator<int> >, then the two closing angle brackets (>) must be separated by a space, to avoid being interpreted as the shift right operator (>>).

Templates requirements and concepts

Templates implicitly impose requirements on their parameters, particularly Template Type Parameters and Template Template Parameters. These requirements generally take the form of operations that must work on objects of the appropriate type, or class members that must exist and refer to objects or types or functions (with the implicit extra requirement that the type referred to is a class). A Concept is a set of requirements that describe a useful feature of a type. For example, the C++ Standard Library makes reference to things being Assignable or Copy-Constructible; these are Concepts that make the following requirements:

  • Given a type T, that type is Copy-Constructible if the expression T a(b); is defined, where b is an expression of type T, and the resultant object a has a value equivalent to the value of the expression b. That type is Assignable if the expression a=b; is defined, where a and b are expressions of type T, and the value of the object referred to by the expression a is equivalent to the value of the object referred to by the expression b after the assignment.
  • All built-in types are both Copy-Constructible and Assignable. For classes, these requirements translate into requirements on member functions.
  • A class T is Copy-Constructible if it has a constructor which can be called with one argument of type const-reference-to-T. This may be a constructor with one argument, or it may be a constructor with more than one argument, with default values provided for the remaining arguments. The object thus constructed should have a value equivalent to that of the argument.
  • A class T is Assignable if it defines a copy-assignment operator (operator=()) which has an argument of type T, or const-reference-to-T. The object to which the member function belongs should have a value equivalent to that of the argument after the completion of such a member function.

Template Specialization and Overloading

Template Specialization allows one to decide that for a specific set of template parameters, the code instantiated for a template should be different to the general case. Say, for example, one wish to use a template with a new class as a Template Type Parameter, and though it has the same semantic behaviour as the classes the template was designed for, the syntax is different, so the original template won’t compile with the new class as a parameter. The solution to this problem is specialization. Consider the following template function definition:

template<typename T>
void func(T value)
{
value.add(3);
}

This function assumes that the type substituted for the Template Type Parameter T is a class with a member function add that takes a single parameter of a type that can be constructed from an int. Thus the following classes would be OK:

class T1
{
public:
void add(int i,double d=0.0);
};
class T2
{
public:
std::string add(double d);
};

However, the class T3 below is not OK, because Add has a capital A.

class T3
{
public:
void Add(int i);
};

This means that one cannot use the original function template func for T3.

Advantages and Disadvanatges

Generics generate different classes based on the input type. An std::vector<int> is a completely different class than an std::vector<float>. However, since C++ templates create different classes for every variation of their template parameters, the size of the compiled executable is larger. Beyond that, compilation time increases greatly, since all template code must be included with each compilation unit and much more code must be generated. However, actual runtime memory footprint is typically smaller than the alternative (frees an extra void*) and performance is better, since the compiler can perform more aggressive optimizations with the known type. While a generic Java class compiles it's entire self, when using a C++ template, you only compile what you use. So, if you create an std::vector<int> and only use push_back and size, only those functions will be compiled into the object file. This eases the size of executable problem.

Conclusion

Generics in all languages have the common advantage of saving developer's effort on type casting of individual types and generalizes functionality irrespective of individual data types hence reducing code duplication, code cluttering and improving code re-usability. Compilation and generation of final executable follows a different path in C++ and Java. In Java, all the generic classes of different types are all considered the same with type Object, while in C++ each of the templates is a different class. As a result, in Java, there is not increase in the size of executable files, but there is no improvement in runtime efficiency as well. In C++, however, the size of the executable is much large, but there is improved efficiency.

References

1. http://www.artima.com/intv/genericsP.html
2. http://www.generic-programming.org
3. Generics in the Java Programming Language
4. http://www.cplusplus.com/doc/tutorial/templates
5. http://osl.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf
6. http://msdn.microsoft.com/en-us/library/ms172192.aspx#Y1524
7. http://www.justsoftwaresolutions.co.uk/articles/intrototemplates.pdf
8. http://www.sgi.com/tech/stl/stl_introduction.html