CSC/ECE 517 Fall 2011/ch6 6c p
Generics
Introduction
In a broad definition, generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters.
Why Generics?
This approach involves writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplication.
Duplicate code is a computer programming term for a sequence of source code that occurs more than once, either within a program or across different programs owned or maintained by the same entity. Duplicate code is generally considered undesirable.
Problems with code duplication
Avoiding bulk code
Code duplication frequently creates long, repeated sections of code that differ in only a few lines or characters. The length of such routines can make it difficult to quickly understand them. Hence, code duplication is discouraged.
Purpose masking
The repetition of largely identical code sections makes it difficult to identify how they differ from one another, and therefore, identifying the specific purpose of each code section. Often, the only difference is in a parameter value. The best practice in such cases is a reusable subroutine. If the parameter type differs then it is called a generic case.
Update anomalies
Any modification to a redundant piece of code must be made for each duplicate separately which is very tedious. At worst, some locations may be missed, and for example bugs thought to be fixed may persist in duplicated locations for months or years. The best practice here is a code library.
The generics approach was popularized by Ada - programming language. Code set written using generic programming are known as generics. They are used in programming languages like Ada, Java, C++ and Visual Basic .NET. Generics create dynamically high parameterized software which can be difficult to follow.
The term generic programming was originally coined by David Musser and Alexander Stepanov.
Popular Example of Generics
Early examples of this programming approach were implemented in Ada, although the best example is the Standard Template Library (STL) in which iterators are used to access different values of an algorithm and provide them with varying datatype inputs.
The Standard Template Library (STL)
It is the C++'s Standard software library. It provides four components such algorithms, containers, functors, and iterators. We are mainly concerned with the algorithm and iterators. Here, as mentioned above algorithms are accessed with iterators where the values provided through iterators and make the algorithm work with different data types having same functionality.
The STL provides a ready-made set of common classes for C++, such as containers and associative arrays, that can be used with any built-in type and with any user-defined type that supports some elementary operations (such as copying and assignment). STL algorithms are independent of containers, which significantly reduces the complexity of the library.
Understanding Generics
Let us consider an example in STI. A sequence of data structures, e.g. a vector etc., and some algorithms to operate on them, e.g. find, sort etc., a direct approach would implement each algorithm specifically for each data structure which would be highly impossible. In the generic programming approach, each data structure returns a model of an iterator concept (a simple value type which can be dereferenced to retrieve the current value, or changed to point to another value in the sequence) and each algorithm is instead written generically with arguments of such iterators. Thus, reducing the data structure-algorithm combinations need be implemented and extensive reusability.
Iterators are of different types for e.g. forward iterators only provide movement to the next value in a sequence (e.g. suitable for a singly linked list), whereas a random-access iterator also provides direct constant-time access to any element of the sequence (e.g. suitable for a vector). An important point is that a data structure will return a model of the most general concept that can be implemented efficiently—computational complexity requirements are explicitly part of the concept definition.
Here, Arrays and Structs can be viewed as predefined generic types. Every usage of an array or struct type instantiates a new concrete type, or reuses a previous instantiated type. Array element types and struct element types are parameterized types, which are used instantiate the corresponding generic type. All this is usually built-in in the compiler and the syntax differs from other generic constructs.
Generics in Object Oriented Programming Languages
Generic programming first appeared in Ada and was subsequently adopted by many object-oriented languages. Implementation in languages such as Java and C++ are formally based on the notion of parametricity.
Generics
Define Common Block { Code statements }
Program { Call Common Block // For different data types and values. }
Example of generic programming[1]
Let us consider the memcpy() function to copy a set of array values and then generalize it using generics approach.
void* memcpy(void* region1, const void* region2, size_t n) { const char* first = (const char*)region2; const char* last = ((const char*)region2) + n; char* result = (char*)region1; while (first != last) *result++ = *first++; return result; }
The memcpy() function is used to copy arrays of different kinds of data. Now considering the case were the data we would like to copy is not in an array? Perhaps it is in a linked list. We can generalize a copy function to any sequence of elements? Looking at the body of memcpy(), the function's minimal requirements are that it needs to traverse through a sequence using pointer to access elements, write them to the destination, and compare pointers to know when to stop. The C++ standard library groups requirements such as these into concepts, in this case the Input Iterator concept (for region2) and the Output Iterator concept (for region1).
Generic approach code
template <typename InputIterator, typename OutputIterator> OutputIterator copy(InputIterator first, InputIterator last, OutputIterator result) { while (first != last) *result++ = *first++; return result; }
Using the above generic copy() function, we can now copy elements from any kind of sequence, including a linked list that exports iterators such as std::list.
#include <list> #include <vector> #include <iostream>
int main() { const int N = 3; std::vector<int> region1(N); std::list<int> region2;
region2.push_back(1); region2.push_back(0); region2.push_back(3); std::copy(region2.begin(), region2.end(), region1.begin());
for (int i = 0; i < N; ++i) std::cout << region1[i] << " "; std::cout << std::endl; }
Generics in ADA
In a generic a parameter is a value, a variable, a constant, a type, a subprogram, or even an instance of another, designated, generic unit that is used in the generic routine. For generic formal types, the syntax distinguishes between discrete, floating-point, fixed-point, access (pointer) types, etc. Some formal parameters can have default values.
To instantiate a generic unit, the programmer passes actual parameters for each formal. The generic instance then behaves just like any other unit. It is possible to instantiate generic units at run-time, for example inside a loop.
One of the ways in which the Ada language supports this characteristic is by means of generic units.
The generic here uses a package which contains a general operation to be performed.
The specification of a generic package[2]
generic type Element_T is private; -- Generic formal type parameter procedure Swap (X, Y : in out Element_T);
procedure Swap (X, Y : in out Element_T) is Temporary : constant Element_T := X; begin X := Y; Y := Temporary; end Swap;
The Swap subprogram is said to be generic. The subprogram specification is preceded by the generic formal part consisting of the reserved word generic followed by a list of generic formal parameters which may be empty. The entities declared as generic are not directly usable, it is necessary to instantiate them.
To use the generic we instantiate by
procedure Instance_Swap is new Swap (Float); procedure Instance_Swap is new Swap (Integer); procedure Instance_Swap is new Swap (Element_T => Stack_T);
Here, the parameters passed have different data types making the generic widely usable.
A generic using a sub program
It is possible to pass a subprogram as a parameter to a generic. The generic specifies a generic formal subprogram, complete with parameter list and return type (if the subprogram is a function). The actual must match this parameter profile. It is not necessary that the names of parameters match, though.
generic type Element_T is private; with function "*" (X, Y: Element_T) return Element_T; function Square (X : Element_T) return Element_T;
Function description
function Square (X: Element_T) return Element_T is begin return X * X; -- The formal operator "*". end Square;
Usage of the generic
with Square; with Matrices; procedure Matrix_Example is function Square_Matrix is new Square (Element_T => Matrices.Matrix_T, "*" => Matrices.Product); A : Matrices.Matrix_T := Matrices.Identity; begin A := Square_Matrix (A); end Matrix_Example;
To instantiate a generic unit, use the keyword new
function Square_Matrix is new Square (Element_T => Matrices.Matrix_T, "*" => Matrices.Product);
Advantages
The language syntax allows precise specification of constraints on generic formal parameters. For example, it is possible to specify that a generic formal type will only accept a modular type as the actual. It is also possible to express constraints between generic formal parameters; for example:
generic type Index_Type is (<>); -- must be a discrete type type Element_Type is private; -- can be any nonlimited type type Array_Type is array (Index_Type range <>) of Element_Type;
In this example, Array_Type is constrained by both Index_Type and Element_Type. When instantiating the unit, the programmer must pass an actual array type that satisfies these constraints.
All instances of a generic being exactly the same, it is easier to review and understand programs written by others; there are no "special cases" to take into account.
Limitations
Ada does not allow specialized generic instances, unlike C++ and requires that all generics be instantiated explicitly but all instantiations being explicit, there are no hidden instantiations that might make it difficult to understand the program.
Ada does not permit "template metaprogramming", because it does not allow specializations.
Generics in C++
When creating containers of objects it is possible to write specific implementations for each data type contained, even if the code is virtually identical except for different data types. In C++, this duplication of code can be circumvented by defining a template class. C++ has can support Generic Programming very well through this template system.
Concepts have no direct representation in C++, so they are represented only within the documentation for a generic library. The documentation for the SGI Standard Template Library established the de facto standard for documenting concepts.
. Valid Expressions are C++ expressions which must compile successfully for the objects involved in the expression to be considered models of the concept.
Associated Types are types that are related to the modeling type in that they participate in one or more of the valid expressions. Typically associated types can be accessed either through typedefs nested within a class definition for the modeling type, or they are accessed through a traits class.
Invariants are run-time characteristics of the objects that must always be true, that is, the functions involving the objects must preserve these characteristics. The invariants often take the form of pre-conditions and post-conditions.
Complexity Guarantees are maximum limits on how long the execution of one of the valid expressions will take, or how much of various resources its computation will use.
The specification of a generic package[2]
generic type Element_T is private; -- Generic formal type parameter procedure Swap (X, Y : in out Element_T);
procedure Swap (X, Y : in out Element_T) is Temporary : constant Element_T := X; begin X := Y; Y := Temporary; end Swap;
The Swap subprogram is said to be generic. The subprogram specification is preceded by the generic formal part consisting of the reserved word generic followed by a list of generic formal parameters which may be empty. The entities declared as generic are not directly usable, it is necessary to instantiate them.
To use the generic we instantiate by
procedure Instance_Swap is new Swap (Float); procedure Instance_Swap is new Swap (Integer); procedure Instance_Swap is new Swap (Element_T => Stack_T);
Here, the parameters passed have different data types making the generic widely usable.
A generic using a sub program
It is possible to pass a subprogram as a parameter to a generic. The generic specifies a generic formal subprogram, complete with parameter list and return type (if the subprogram is a function). The actual must match this parameter profile. It is not necessary that the names of parameters match, though.
generic type Element_T is private; with function "*" (X, Y: Element_T) return Element_T; function Square (X : Element_T) return Element_T;
Function description
function Square (X: Element_T) return Element_T is begin return X * X; -- The formal operator "*". end Square;
Usage of the generic
with Square; with Matrices; procedure Matrix_Example is function Square_Matrix is new Square (Element_T => Matrices.Matrix_T, "*" => Matrices.Product); A : Matrices.Matrix_T := Matrices.Identity; begin A := Square_Matrix (A); end Matrix_Example;
To instantiate a generic unit, use the keyword new
function Square_Matrix is new Square (Element_T => Matrices.Matrix_T, "*" => Matrices.Product);
Advantages
The language syntax allows precise specification of constraints on generic formal parameters. For example, it is possible to specify that a generic formal type will only accept a modular type as the actual. It is also possible to express constraints between generic formal parameters; for example:
generic type Index_Type is (<>); -- must be a discrete type type Element_Type is private; -- can be any nonlimited type type Array_Type is array (Index_Type range <>) of Element_Type;
In this example, Array_Type is constrained by both Index_Type and Element_Type. When instantiating the unit, the programmer must pass an actual array type that satisfies these constraints.
All instances of a generic being exactly the same, it is easier to review and understand programs written by others; there are no "special cases" to take into account.
Limitations
Ada does not allow specialized generic instances, unlike C++ and requires that all generics be instantiated explicitly but all instantiations being explicit, there are no hidden instantiations that might make it difficult to understand the program.
Ada does not permit "template metaprogramming", because it does not allow specializations.
GC++
Generic approach code
See also
External Links:
- Test::Unit
- Ruby-doc
- Shoulda
- RSpec
- RSpec Documentation
- Shoulda
- Cucumber
- Unit testing
- Test-driven development
- Riot