CSC/ECE 517 Fall 2011/ch6 6c p

From Expertiza_Wiki
Revision as of 19:56, 16 November 2011 by Pkarli (talk | contribs)
Jump to navigation Jump to search

Generics

Introduction

In a broad definition, generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters.

Why Generics?

This approach involves writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplication.

Duplicate code is a computer programming term for a sequence of source code that occurs more than once, either within a program or across different programs owned or maintained by the same entity. Duplicate code is generally considered undesirable.

Problems with code duplication

Avoiding bulk code

Code duplication frequently creates long, repeated sections of code that differ in only a few lines or characters. The length of such routines can make it difficult to quickly understand them. Hence, code duplication is discouraged.

Purpose masking

The repetition of largely identical code sections makes it difficult to identify how they differ from one another, and therefore, identifying the specific purpose of each code section. Often, the only difference is in a parameter value. The best practice in such cases is a reusable subroutine. If the parameter type differs then it is called a generic case.

Update anomalies

Any modification to a redundant piece of code must be made for each duplicate separately which is very tedious. At worst, some locations may be missed, and for example bugs thought to be fixed may persist in duplicated locations for months or years. The best practice here is a code library.

The generics approach was popularized by Ada - programming language. Code set written using generic programming are known as generics. They are used in programming languages like Ada, Java, C++ and Visual Basic .NET. Generics create dynamically high parameterized software which can be difficult to follow.

The term generic programming was originally coined by David Musser and Alexander Stepanov.

Popular Example of Generics

Early examples of this programming approach were implemented in Ada, although the best example is the Standard Template Library (STL) in which iterators are used to access different values of an algorithm and provide them with varying datatype inputs.

The Standard Template Library (STL)

It is the C++'s Standard software library. It provides four components such algorithms, containers, functors, and iterators. We are mainly concerned with the algorithm and iterators. Here, as mentioned above algorithms are accessed with iterators where the values provided through iterators and make the algorithm work with different data types having same functionality.

The STL provides a ready-made set of common classes for C++, such as containers and associative arrays, that can be used with any built-in type and with any user-defined type that supports some elementary operations (such as copying and assignment). STL algorithms are independent of containers, which significantly reduces the complexity of the library.

Understanding Generics

Let us consider an example in STI. A sequence of data structures, e.g. a vector etc., and some algorithms to operate on them, e.g. find, sort etc., a direct approach would implement each algorithm specifically for each data structure which would be highly impossible. In the generic programming approach, each data structure returns a model of an iterator concept (a simple value type which can be dereferenced to retrieve the current value, or changed to point to another value in the sequence) and each algorithm is instead written generically with arguments of such iterators. Thus, reducing the data structure-algorithm combinations need be implemented and extensive reusability.

Iterators are of different types for e.g. forward iterators only provide movement to the next value in a sequence (e.g. suitable for a singly linked list), whereas a random-access iterator also provides direct constant-time access to any element of the sequence (e.g. suitable for a vector). An important point is that a data structure will return a model of the most general concept that can be implemented efficiently—computational complexity requirements are explicitly part of the concept definition.

Here, Arrays and Structs can be viewed as predefined generic types. Every usage of an array or struct type instantiates a new concrete type, or reuses a previous instantiated type. Array element types and struct element types are parameterized types, which are used instantiate the corresponding generic type. All this is usually built-in in the compiler and the syntax differs from other generic constructs.

Generics in Object Oriented Programming Languages

Generic programming first appeared in Ada and was subsequently adopted by many object-oriented languages. Implementation in languages such as Java and C++ are formally based on the notion of parametricity.

Generics

Define Common Block	
{ Code statements
}
Program
{ Call Common Block  // For different data types and values.
}

Example of generic programming[1]

Let us consider the memcpy() function to copy a set of array values and then generalize it using generics approach.

void* memcpy(void* region1, const void* region2, size_t n)
{
  const char* first = (const char*)region2;
  const char* last = ((const char*)region2) + n;
  char* result = (char*)region1;
  while (first != last)
    *result++ = *first++;
  return result;
}

The memcpy() function is used to copy arrays of different kinds of data. Now considering the case were the data we would like to copy is not in an array? Perhaps it is in a linked list. We can generalize a copy function to any sequence of elements? Looking at the body of memcpy(), the function's minimal requirements are that it needs to traverse through a sequence using pointer to access elements, write them to the destination, and compare pointers to know when to stop. The C++ standard library groups requirements such as these into concepts, in this case the Input Iterator concept (for region2) and the Output Iterator concept (for region1).

Generic approach code

template <typename InputIterator, typename OutputIterator>
OutputIterator
copy(InputIterator first, InputIterator last, OutputIterator result)
{
  while (first != last)
    *result++ = *first++;
  return result;
}

Using the above generic copy() function, we can now copy elements from any kind of sequence, including a linked list that exports iterators such as std::list.

#include <list>
#include <vector>
#include <iostream>
int main()
{
  const int N = 3;
  std::vector<int> region1(N);
  std::list<int> region2;
  region2.push_back(1);
  region2.push_back(0);
  region2.push_back(3);
 
  std::copy(region2.begin(), region2.end(), region1.begin());
  for (int i = 0; i < N; ++i)
    std::cout << region1[i] << " ";
  std::cout << std::endl;
}

Generics in ADA

In a generic a parameter is a value, a variable, a constant, a type, a subprogram, or even an instance of another, designated, generic unit that is used in the generic routine. For generic formal types, the syntax distinguishes between discrete, floating-point, fixed-point, access (pointer) types, etc. Some formal parameters can have default values.

To instantiate a generic unit, the programmer passes actual parameters for each formal. The generic instance then behaves just like any other unit. It is possible to instantiate generic units at run-time, for example inside a loop.

One of the ways in which the Ada language supports this characteristic is by means of generic units.

The generic here uses a package which contains a general operation to be performed.

The specification of a generic package[2]
generic
  type Element_T is private;  -- Generic formal type parameter
procedure Swap (X, Y : in out Element_T);
procedure Swap (X, Y : in out Element_T) is
  Temporary : constant Element_T := X;
begin
  X := Y;
  Y := Temporary;
end Swap;

The Swap subprogram is said to be generic. The subprogram specification is preceded by the generic formal part consisting of the reserved word generic followed by a list of generic formal parameters which may be empty. The entities declared as generic are not directly usable, it is necessary to instantiate them.

To use the generic we instantiate by

procedure Instance_Swap is new Swap (Float);
procedure Instance_Swap is new Swap (Integer);
procedure Instance_Swap is new Swap (Element_T => Stack_T);

Here, the parameters passed have different data types making the generic widely usable.

A generic using a sub program

It is possible to pass a subprogram as a parameter to a generic. The generic specifies a generic formal subprogram, complete with parameter list and return type (if the subprogram is a function). The actual must match this parameter profile. It is not necessary that the names of parameters match, though.

generic
  type Element_T is private;
  with function "*" (X, Y: Element_T) return Element_T;
function Square (X : Element_T) return Element_T;


Function description

function Square (X: Element_T) return Element_T is
begin
  return X * X;   -- The formal operator "*".
end Square;


Usage of the generic

with Square;
with Matrices;
procedure Matrix_Example is
  function Square_Matrix is new Square
    (Element_T => Matrices.Matrix_T, "*" => Matrices.Product);
  A : Matrices.Matrix_T := Matrices.Identity;
begin
  A := Square_Matrix (A);
end Matrix_Example;

To instantiate a generic unit, use the keyword new

function Square_Matrix is new Square
   (Element_T => Matrices.Matrix_T, "*" => Matrices.Product);
Advantages

The language syntax allows precise specification of constraints on generic formal parameters. For example, it is possible to specify that a generic formal type will only accept a modular type as the actual. It is also possible to express constraints between generic formal parameters; for example:

generic
   type Index_Type is (<>); -- must be a discrete type
   type Element_Type is private; -- can be any nonlimited type
   type Array_Type is array (Index_Type range <>) of Element_Type;

In this example, Array_Type is constrained by both Index_Type and Element_Type. When instantiating the unit, the programmer must pass an actual array type that satisfies these constraints.

All instances of a generic being exactly the same, it is easier to review and understand programs written by others; there are no "special cases" to take into account.

Limitations

Ada does not allow specialized generic instances, unlike C++ and requires that all generics be instantiated explicitly but all instantiations being explicit, there are no hidden instantiations that might make it difficult to understand the program.

Ada does not permit "template metaprogramming", because it does not allow specializations.





GC++
Generic approach code

See also

External Links:

References:

R.Venkat Rajendran, White Paper on Unit Testing

Benchmark Conclusions