CSC/ECE 517 Fall 2009/wiki2 1 MP

Metaprogramming

Metaprogramming is the practice of writing programs that generate or manipulate programs [1]. A metaprogram takes other programs, or itself, as its data; it may transform that code, or carry out at compile time work that would otherwise be done at run time. This gives developers flexibility, since in some cases a program can handle new situations without being recompiled. In many cases it also lets programmers get more done in the time it would take to write all the code by hand [2]. Metaprogramming works in one of two ways: through an application programming interface (API) that exposes the internals of the run-time engine to ordinary code, or through dynamic execution of expressions that contain programming commands. A single language can incorporate both approaches, but most languages support one better than the other [1].
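The two approaches can be sketched side by side in Ruby. This is a minimal illustration, not taken from the cited sources: the Greeter class and its method names are hypothetical, while define_method, public_send and class_eval are core Ruby methods.

```ruby
# Hypothetical class used only for illustration.
class Greeter
  # Approach 1: reflective API. define_method adds a method to the class
  # while the program is running.
  %w[hello goodbye].each do |word|
    define_method("say_#{word}") { |name| "#{word.capitalize}, #{name}!" }
  end
end

greeter = Greeter.new
puts greeter.say_hello("Ada")                  # Hello, Ada!
puts greeter.public_send(:say_goodbye, "Ada")  # Goodbye, Ada!

# Approach 2: dynamic execution. The method body is only a string until
# class_eval turns it into code. The quoted heredoc tag ('RUBY') keeps
# #{name} from being interpolated before the string is evaluated.
Greeter.class_eval <<~'RUBY'
  def say_welcome(name)
    "Welcome, #{name}!"
  end
RUBY

puts greeter.say_welcome("Ada")                # Welcome, Ada!
```

The first form keeps the generated code visible to the parser and to tools; the string form is more flexible but is checked only when it is evaluated.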

Advantages of Metaprogramming

Many programming languages require a great deal of typing to perform routine tasks. Metaprogramming eliminates much of that typing, which in turn reduces the errors caused by mistyping [1]. Metaprogramming has several uses. One is supplying run-time data through pre-generated tables. Suppose an application needs a look-up table of logarithms for a range of numbers. The table can be calculated by hand, built at run time, or built by a program that runs before compilation and emits it as static data. Building the table at run time seems reasonable, but it can delay program start-up; in such cases it is more efficient to write a program that generates the static table [1]. Another common use is replacing boilerplate code. A program might need a near-identical set of error handlers or a long list of variable declarations at every use. Here a metaprogram is handy: the programmer writes a compact description, and the metaprogram expands it into code in the target language before compilation [1].
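The boilerplate case can be sketched in Ruby, with one caveat: Ruby generates the methods when the class is loaded rather than before compilation, so this is an analogue of the technique described above rather than the same mechanism. The Settings class and its field names are invented for illustration.

```ruby
# Hypothetical settings class; the field names are invented for illustration.
class Settings
  FIELDS = { host: String, port: Integer, retries: Integer }.freeze

  # One loop replaces three nearly identical getter/setter pairs.
  FIELDS.each do |name, type|
    define_method(name) { instance_variable_get("@#{name}") }

    define_method("#{name}=") do |value|
      # The repeated error handling lives in exactly one place.
      raise TypeError, "#{name} expects #{type}" unless value.is_a?(type)
      instance_variable_set("@#{name}", value)
    end
  end
end

settings = Settings.new
settings.port = 8080
puts settings.port          # 8080

begin
  settings.port = "eighty"
rescue TypeError => e
  puts e.message            # port expects Integer
end
```

Adding a field now means adding one entry to FIELDS instead of copying and editing another block of code, which is where the reduction in typing errors comes from.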

Macros and Code generators

Metaprogramming can be done through macro expansion or through code generators. Code can be generated before compilation, at compile time, or at run time, and all of these are considered forms of metaprogramming [4].

Macro Languages

Metaprograms often provide small domain-specific languages that are easier to write and maintain than the target language. Such languages can be built with macro languages [1]. The simplest kind is the textual macro language. Textual macros have no knowledge of the programming language they are embedded in; they operate only on the text of the program. The most widely used textual macro systems are the C preprocessor and the M4 macro processor [1].

C preprocessor

The C programming language has little built-in code-generation capability, so metaprogramming in C is usually done with textual macros. Macros avoid the overhead of a function call, and they let a computation be reused without writing out the same statements each time [1]. Consider a macro SUM that adds a and b and stores the sum in a.

#define SUM(a, b, type) { type sum; sum = a + b; a = sum; }

This macro can be called in the function in the following way:

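#include <stdio.h>

#define SUM(a, b, type) { type sum; sum = a + b; a = sum; }

int main(void)
{
    float a = 10, b = 20;
    SUM(a, b, float);   /* a = 10 + 20 = 30 */
    SUM(a, a, float);   /* a = 30 + 30 = 60 */
    printf("Value of a: %.0f\n", a);   /* %d would be wrong for a float */
    return 0;
}

Output: 
Value of a: 60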

When the program is compiled, the preprocessor replaces each use of SUM with the statements given in the macro definition. However, this approach has disadvantages:
1. Textual substitution cannot act like polymorphism in object-oriented languages. In C, handling a different type requires either a separate macro or passing the type as a macro argument, as in the example above.
2. The preprocessor restricts the number of arguments that can be passed to a macro, which reduces flexibility.
3. A variable declared inside the macro must not clash with a variable passed to it. In the example above, calling SUM with a variable named sum would silently break the macro. Checking for such clashes at every use is tedious and error-prone [1].
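The name clash in the third point can be made visible by imitating textual substitution. The Ruby sketch below is not the C preprocessor; it is a simplified stand-in (expand_sum and SUM_TEMPLATE are hypothetical names) that pastes argument text into the body of SUM the way a textual macro processor would.

```ruby
# Mimics what a textual macro processor does: it pastes argument text into
# a template without knowing anything about C scoping rules.
SUM_TEMPLATE = "{ TYPE sum; sum = A + B; A = sum; }"

def expand_sum(a, b, type)
  # gsub with a hash replaces each matched placeholder in a single pass.
  SUM_TEMPLATE.gsub(/\b(TYPE|A|B)\b/, "TYPE" => type, "A" => a, "B" => b)
end

puts expand_sum("a", "b", "float")
# { float sum; sum = a + b; a = sum; }

puts expand_sum("sum", "b", "float")
# { float sum; sum = sum + b; sum = sum; }
```

In the second expansion every sum refers to the temporary declared inside the braces, so the caller's variable of the same name is never updated, and the text-level tool has no way to notice.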

Code Generators

Code generators (such as Flex and Bison) are programs that generate programs. Consider Embedded SQL, a code generator that lets SQL statements be written directly inside C, extending the language so that database access from C becomes easy. It produces C programs as output and makes database access simpler and less error-prone than calling a set of libraries directly [1]. Within a compiler, a code generator generally takes a parse tree or an abstract syntax tree as input and converts it into instructions [3].

Suppose a table of values has to be computed and used in a program, for instance the cube roots of the numbers from 5 to 100, stored in an array. There are three ways to accomplish this. The first is to calculate every cube root by hand and initialize the array with the results, which is certainly not the preferred option. The second is to write code that calculates the cube roots and populates the array at run time; this costs time on every run and is not a good option either. The third is code generation, which is ideal in such circumstances. Consider the example below [1]:

//define a macro
Macro:array_name:data-type:start-index:end-index:default-value:expression-to-evaluate(parameter)

void main()
{
printf("%d", array_name[index]);
}

//Build a code generator for "Macro" that evaluates "expression-to-evaluate"
//Run the code generator and the array is populated with respective computations

In the above example, if the expression to be evaluated is sqrt, a code generator is built for sqrt and made available under the name Macro. Running the code generator populates array_name with the computed values. When the program later refers to array_name, it reads a value that the code generator calculated in advance.
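A generator of this kind might look like the following Ruby sketch, which emits C source for the cube-root table discussed earlier. It is one possible shape under stated assumptions: the function name, the array name cube_roots and the output layout are all illustrative, and Math.cbrt is Ruby's built-in cube-root function.

```ruby
# Emits C source for a precomputed table of cube roots.
# The function name, array name and layout are illustrative assumptions.
def generate_cube_root_table(first, last, array_name: "cube_roots")
  values = (first..last).map { |n| format("%.6f", Math.cbrt(n)) }
  rows = values.each_slice(4).map { |row| "    #{row.join(', ')}" }

  [
    "/* Generated file: do not edit by hand. */",
    "/* #{array_name}[i] holds the cube root of (i + #{first}). */",
    "static const double #{array_name}[#{values.size}] = {",
    rows.join(",\n"),
    "};"
  ].join("\n") + "\n"
end

# Print the generated C declaration; a build step could redirect this
# into a header file that the C program includes.
puts generate_cube_root_table(5, 100)
```

The C program then pays nothing at start-up: the values are already constants in its source, and changing the range only requires rerunning the generator.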

A metalanguage is the language in which metaprograms are written. Some programming languages are their own metalanguage; this property is known as reflection, and it makes metaprogramming easier [2].

Metaprogramming in Object-oriented languages

Let us compare the way metaprogramming is implemented in the following object-oriented languages - Ruby and C++.

Ruby

In Ruby, most declarations are also executable statements. Because Ruby is dynamic and reflective, it supports metaprogramming very well and provides many methods that aid it. Ruby's polymorphism rests on duck typing, in which method calls are resolved at run time, and this fits naturally with metaprogramming. Here we look at an example using instance_eval. eval evaluates a string of Ruby code, while instance_eval evaluates a string or a block in the context of the object on which it is called [5].

[10,20].instance_eval('size')

Output: 2

# instance_eval used on the array to calculate the average value of its elements
# inject(:+) adds up the elements; size is the number of elements

[10,20].instance_eval { inject(:+) / size }

Output: 15
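The remark that declarations in Ruby are executable statements can be illustrated with a short sketch. The Report class and the DETAILED flag are hypothetical; the point is only that a class body runs like any other code, so definitions can depend on run-time conditions.

```ruby
# Hypothetical flag standing in for any condition known only at run time.
DETAILED = true

class Report
  # The class body is ordinary code that runs top to bottom when loaded,
  # so an if statement can decide which definition takes effect.
  if DETAILED
    def summary
      "detailed summary"
    end
  else
    def summary
      "short summary"
    end
  end

  # def is itself an expression that returns the method name as a symbol,
  # which is why it can be passed straight to private.
  private def internal_id
    42
  end
end

puts Report.new.summary                                          # detailed summary
puts Report.private_instance_methods(false).include?(:internal_id)  # true
```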

Ruby also has various language constructs that support metaprogramming, among them class_variable_get, send, const_set and const_get [5]. Consider another example [6], shown below. To find the date and time two days ago, the user just types 2.days.ago. This reads naturally in Ruby because everything, including a number, is an object, and existing classes can be reopened to add methods. It resembles defining a macro whose value is computed from the input passed to it.

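class Numeric            # reopen the built-in class so that numbers gain these methods
  def days
    self * 24 * 60 * 60  # number of seconds in that many days
  end
  def ago
    Time.now - self      # the time that many seconds before now
  end
end

2.days.ago
Output: date-time (2 days before now)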
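The reflective methods named above (send, class_variable_get, const_set and const_get) can be exercised in a few lines. The Counter class is hypothetical and exists only to give the calls something to act on.

```ruby
# Hypothetical class used only to exercise the reflective calls.
class Counter
  @@created = 0

  def initialize
    @@created += 1
  end

  def label(prefix)
    "#{prefix}-#{@@created}"
  end
end

Counter.new
Counter.new

# send invokes a method whose name is chosen at run time.
puts Counter.new.send(:label, "item")          # item-3

# class_variable_get reads a class variable from outside the class.
puts Counter.class_variable_get(:@@created)    # 3

# const_set and const_get create and look up constants by name.
Counter.const_set(:MAX_ITEMS, 10)
puts Counter.const_get(:MAX_ITEMS)             # 10
```

In each call the name is plain data (a symbol), so it could equally come from a configuration file or user input, which is what makes these methods useful building blocks for metaprograms.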

C++

C++ has macros, which act as metaprograms as explained in the earlier section. This section shows how metaprogramming is done with templates, a technique known as template metaprogramming. The C++ compiler uses templates to generate temporary source code at compile time. This code is merged with the rest of the source, and the whole is then compiled. Templates can produce data structures, compile-time constants, functions and so on [7]. Consider an example [7] that calculates the factorial of a number. The program below is written in the usual way, and the factorial is calculated at run time.

#include <cstdio>

int fact(int n) 
{
    if (n == 0)
       return 1;
    return n * fact(n - 1);
}
int main()
{
    printf("%d", fact(5)); //fact(5) = 120
    return 0;
}
Output: 120

However, in the code below templates are used. The factorial of 5 is computed at compile time and stored as a constant, so the running program reads the value as if it came from a precomputed table.

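#include <cstdio>

template <int num>
struct fact 
{
    enum { ret = num * fact<num - 1>::ret };
};

// Base case: without this specialization the recursion never terminates
template <>
struct fact<0>
{
    enum { ret = 1 };
};
 
int main()
{
    int val = fact<5>::ret; // evaluates to 120 (5!) at compile time
    printf("%d", val); 
    return 0;
}

Output: 120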

Metaprogramming in Ruby Vs C++

Although metaprogramming exists in C++, it is not used as widely as in Ruby, and many programmers never exploit templates in their programs. C++ has no run-time extensibility, and templates generally cannot be shipped as precompiled libraries: they are instantiated at compile time, so every file that uses a template must include the header that defines it [8]. This makes code built on templates harder to maintain. In C++, metaprogramming requires templates or macros to be defined explicitly. In Ruby, many ordinary language constructs are themselves metaprogramming facilities and are used implicitly, without any special definition. As a result, metaprogramming can go unnoticed by novice C++ users, whereas in Ruby the constructs are so pervasive that developers end up writing metaprograms without realizing it. Some older C++ compilers do not fully support templates, whereas Ruby has no comparable problem.
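A familiar instance of this implicit use is attr_accessor, which most Ruby programmers write without thinking of it as metaprogramming. The Point class below is a hypothetical example.

```ruby
# Hypothetical class for illustration.
class Point
  # attr_accessor is an ordinary method call, not special syntax. It
  # generates x, x=, y and y= while the class body is being executed.
  attr_accessor :x, :y
end

p Point.instance_methods(false).sort   # [:x, :x=, :y, :y=]

point = Point.new
point.x = 3
puts point.x                           # 3
```

Nothing in the class body defines these four methods by hand; they exist only because a method call generated them.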

Conclusion

Features that are built into one language may be available in another only through metaprograms. However, the main reason for using metaprograms is not to work around the design of a programming language but to make maintenance easier [1]. Metaprogramming can improve performance by moving work to compile time, and it lets programmers eliminate redundant code elegantly.

References

1. http://www.ibm.com/developerworks/linux/library/l-metaprog1.html
2. http://en.wikipedia.org/wiki/Metaprogramming
3. http://en.wikipedia.org/wiki/Code_generation_%28compiler%29
4. http://brandonbyars.com/blog/articles/2008/03/29/code-generation-and-metaprogramming
5. http://weare.buildingsky.net/2009/08/25/rubys-metaprogramming-toolbox
6. http://video.kiberpipa.org/media/SU_David_Krmpotic-Meta-programiranje_v_Ruby-ju/Ruby_Metaprogramming.pdf
7. http://en.wikipedia.org/wiki/Template_metaprogramming
8. http://articles.techrepublic.com.com/5100-10878_11-1050219.html

External Links

1. The Ruby Object Model and Metaprogramming
2. Metaprogramming
3. Template Metaprogramming
4. Code Generation Vs Metaprogramming
5. Ruby Metaprogramming
6. Ruby Metaprogramming Techniques