CSC/ECE 517 Spring 2013/ch1 1h jc

Metaprogramming in statically typed languages

Introduction

What is metaprogramming

Metaprogramming refers to writing programs that can manipulate other programs or themselves. The program that is doing the manipulating is called the metaprogram, and the program that is being manipulated is called the object program. The language of the metaprogram is referred to as the metalanguage and the language of the object program is called the object language. A language has the ability of reflection if it can be used to write a metaprogram that can manipulate a program written in the same language. <ref name="mp1">http://en.wikipedia.org/wiki/Metaprogramming</ref>

Metaprogramming in statically typed languages

Metaprogramming can be accomplished in both statically and dynamically typed languages. However, dynamically typed languages have some advantages over statically typed languages when it comes to metaprogramming.

One example of a metaprogramming task is to add a method to an object at runtime. Sample code in Ruby to perform this task is outlined below. First is a definition of a Ruby class called TestAddMethodAtRuntime:

 class TestAddMethodAtRuntime
   def originalMethod
       puts "Original Method"
   end
 end

The code below creates a new instance of the class. It then defines a new method called newlyAddedMethod on that instance. Finally, the original method is called, followed by the method that was added at runtime:

 t = TestAddMethodAtRuntime.new
 def t.newlyAddedMethod
   puts "Newly Added Method"
 end
 t.originalMethod
 puts "\n"
 t.newlyAddedMethod

Below is the output of the code above:

 Original Method
 Newly Added Method

Accomplishing something like this in a statically typed language would be much more difficult or impossible depending on the language. It is for this reason that dynamically typed languages are usually viewed as being better equipped to handle metaprogramming overall.

Implementation

Programming Language API

One way that metaprogramming can be possible in a language is through an API that helps programmers achieve a metaprogramming task. An example of this is a language that provides an API for reflection. One type of reflection is introspection, in which the program examines its own structure at runtime, such as its classes, methods and fields. <ref name="ref1">https://blogs.oracle.com/java/entry/metaprogramming_manipulating_data_about_data</ref>

Below is an example from the Java API to illustrate how an API can be used to achieve some metaprogramming tasks:

  import java.lang.reflect.*;
  public class DumpMethods {
    public static void main(String[] args) {
      try {
        // Load the class named by the first command-line argument
        Class<?> c = Class.forName(args[0]);
        Method[] m = c.getDeclaredMethods();
        for (int i = 0; i < m.length; i++) {
          System.out.println(m[i].toString());
        }
      }
      catch (Throwable e) {
        System.err.println(e);
      }
    }
  }

The above example prints out all of the methods that are declared by the class whose name is passed as the first command-line argument. This is an example of metaprogramming since the program inspects the structure of a program (possibly itself) at runtime.
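
Reflection can also act on what it discovers. As a minimal sketch (the class InvokeByName and the helper callNoArg are hypothetical names introduced here, not part of the Java API), the following looks up a public no-argument method by name and invokes it:

```java
import java.lang.reflect.Method;

class InvokeByName {
    // Hypothetical helper: finds a public no-argument method by name and calls it.
    static Object callNoArg(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            // Covers a missing method, an inaccessible method, or an exception thrown by it.
            throw new IllegalArgumentException("Cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callNoArg("metaprogramming", "length"));      // prints 15
        System.out.println(callNoArg("metaprogramming", "toUpperCase")); // prints METAPROGRAMMING
    }
}
```

The method name is an ordinary string here, so it could just as well come from user input or a configuration file, which is what makes this a metaprogramming technique rather than a normal call.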

Program transformation system

A program transformation system takes a program as input and outputs a different program. One example of a program transformation system is the Java compiler. The Java compiler takes a Java program as its input, and outputs a .class file with platform-independent bytecode in it. Another example of a program transformation system is a Java decompiler, which takes bytecode as an input and outputs a Java program.

In the case of the Java compiler, the output program is written in a different "language" from the input program. There are other types of transformation systems that take a program as an input and output a different program in the same language. One example of this would be program migration, which migrates source code to a newer or older version of the same language. This can be useful when upgrading a software system to run on a newer version of a framework. <ref name="pm1">http://www.program-transformation.org/Transform/ProgramMigration</ref>

Common Uses

Pre-generate static data at compile time

One common use of metaprogramming in statically typed languages is to write programs that will pre-generate tables of data for use at runtime.<ref name="use1">Jonathan Bartlett, "The art of metaprogramming, Part 1: Introduction to metaprogramming", 20 Oct 2005 http://www.ibm.com/developerworks/library/l-metaprog1/index.html#N10052</ref>

One simple but useful code generator is to build static lookup tables. Often, in order to build fast functions in C programming, we simply create a lookup table of all of the answers. This means that we either need to pre-compute them by hand (which is wasteful of your time) or build them at runtime (which is wasteful of the user's time).

In the following example we will build a generator that will take a function or set of functions on an integer and build lookup tables for the answers.

To think of how to make such a program, we can start from the end and work backward. First, we need a lookup table that will return square roots of numbers between 5 and 20. A simple program can be written to generate such a table like this:

Generate and use a lookup table of square roots

       #include <math.h>
       #include <stdio.h>
       /* our lookup table */
       double square_roots[21];
       /* function to load the table at runtime */
       void init_square_roots()
       {
          int i;
          for(i = 5; i < 21; i++)
            {
              square_roots[i] = sqrt((double)i);
            }
       }
       /* program that uses the table */
       int main ()
       {
          init_square_roots();
          printf("The square root of 5 is %f\n", square_roots[5]);
          return 0;
       }

With a single macro, expanded by the code generator developed later in this section, we can take away a lot of work for any program that has to generate mathematical tables indexed by integer. A little extra work would also allow tables containing full struct definitions; a little more would ensure that space isn't wasted at the front of the array with useless empty entries.

Now, to convert this program to a statically initialized array, you would remove the first part of the program and replace it with something like this, calculated by hand:

Square root program with a static lookup table

       double square_roots[] = {
          /* these are the ones we skipped */ 0.0, 0.0, 0.0, 0.0, 0.0,
          2.236068, /* Square root of 5 */
          2.449490, /* Square root of 6 */
          2.645751, /* Square root of 7 */
          2.828427, /* Square root of 8 */
          3.0, /* Square root of 9 */
          ...
          4.472136 /* Square root of 20 */
          };

What is needed is a program that will produce these values and print them out in a table like the previous one so they are loaded in at compile-time.

Code generator for the table macro

          #!/usr/bin/perl  
          #
          #tablegen.pl 
          #
          ##Puts each program line into $line
          while(my $line = <>)
          {
            #Is this a macro invocation?
            if($line =~ m/TABLE:/)
            {
               #If so, split it apart into its component pieces
               my ($dummy, $table_name, $type, $start_idx, $end_idx, $default, $procedure) = split(m/:/, $line, 7);
               #The main difference between C and Perl for mathematical expressions is that
               #Perl prefixes its variables with a dollar sign, so we will add that here
               $procedure =~ s/VAL/\$VAL/g;
               #Print out the array declaration
               print "${type} ${table_name} [] = {\n";
               #Go through each array element
               foreach my $VAL (0 .. $end_idx)
               {
                  #Only process an answer if we have reached our starting index
                  if($VAL >= $start_idx)
                  {
                     #evaluate the procedure specified (this sets $@ if there are any errors)
                     $result = eval $procedure;
                     die("Error processing: $@") if $@;
                  }
                  else
                  {
                     #if we haven't reached the starting index, just use the default
                     $result = $default;
                  }
                  #Print out the value
                  print "\t${result}";
                  #If there are more to be processed, add a comma after the value
                  if($VAL != $end_idx)
                  {
                     print ",";
                  }
                  print "\n"
               }
               #Finish the declaration
               print "};\n";
            }
            else
            {
               #If this is not a macro invocation, just copy the line directly to the output
               print $line;
            }
          }

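To run this program, save the C program in a file named sqrt.in, with the table declaration and its runtime initialization replaced by a macro line such as TABLE:square_roots:double:5:20:0.0:sqrt(VAL) (the seven colon-separated fields that the script splits apart), and then do this: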

Running the code generator

          ./tablegen.pl < sqrt.in > sqrt.c
          gcc sqrt.c -o sqrt
          ./sqrt
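
The same generator idea can be sketched in Java instead of Perl. This is only an illustration under stated assumptions, not the article's tool: the class SqrtTableGen and its generate method are hypothetical names, and the table bounds are passed as arguments rather than parsed from a TABLE: macro line.

```java
import java.util.Locale;

class SqrtTableGen {
    // Builds the C source text for a statically initialized table of square roots.
    // Entries below 'start' are filled with 0.0, as in the hand-written table.
    static String generate(String name, int start, int end) {
        StringBuilder out = new StringBuilder();
        out.append("double ").append(name).append("[] = {\n");
        for (int i = 0; i <= end; i++) {
            double value = (i >= start) ? Math.sqrt(i) : 0.0;
            // Locale.ROOT keeps '.' as the decimal separator on every machine.
            out.append(String.format(Locale.ROOT, "\t%f", value));
            if (i != end) {
                out.append(",");
            }
            out.append("\n");
        }
        out.append("};\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // Emits C code on standard output; redirect it into a .c file to use it.
        System.out.print(generate("square_roots", 5, 20));
    }
}
```

The output is C source text, so the Java program plays the role of the metaprogram and the generated C file is the object program.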

Mini-language for boilerplate

If you have a large application where many of the functions include a lot of boilerplate code, it is often a good idea to create a mini-language that allows you to work with your boilerplate code in an easier fashion.<ref name="boiler">K. Czarnecki, "Generative Programming" chapter 8, Static Metaprogramming in C++</ref> This mini-language will then be converted into your regular source code language before compiling.<ref name="use2">Jonathan Bartlett, "The art of metaprogramming, Part 2: Metaprogramming using Scheme", 02 May 2006 http://www.ibm.com/developerworks/linux/library/l-metaprog2/index.html</ref>

The following is an example:

Let's say that we are building a CGI application consisting of many independent CGI scripts. In most CGI applications, much of the state is stored in a database, and only a session ID is passed to each script via a cookie.

However, in nearly every page we need to know the other standard information (such as the username, the group number, the current job being worked on, and whatever other information is pertinent). In addition, we need to redirect the user if they do not have an appropriate cookie.

           (define (handle-cgi-request req)
             (let (
                    (session-id (webserver:cookie req "sessionid")))
               (if (not (webserver:valid-session-id session-id))
                   (webserver:redirect-to-login-page)
                   ;;let* is needed because group and current-job refer to username
                   (let* (
                         (username (webserver:username-for-session session-id))
                         (group (webserver:group-for-user username))
                         (current-job (webserver:current-job-for-user username)))
                      ;;Code for processing goes here
                      ))))

While some of that can be handled by a procedure, the bindings certainly cannot. However, we can turn most of it into a macro. The macro can be implemented like this:

           (define-syntax cgi-boilerplate
             (lambda (x)
               (syntax-case x ()
                (
                  ;;k is bound to the macro keyword so the expansion sees the caller's bindings (such as req)
                  (k expr)
                  (datum->syntax-object
                    (syntax k)
                     ;;the quasiquote builds the expansion as data; the unquote inserts the caller's expression
                     `(let (
                           (session-id (webserver:cookie req "sessionid")))
                          (if (not (webserver:valid-session-id session-id))
                              (webserver:redirect-to-login-page)
                              (let* (
                                     (username (webserver:username-for-session session-id))
                                     (group (webserver:group-for-user username))
                                     (current-job (webserver:current-job-for-user username)))
                                     ,(syntax-object->datum (syntax expr))))))
                 )
            )))

We can now create new forms based on our boilerplate code by doing the following:

           (define (handle-cgi-request req)
             (cgi-boilerplate
              (begin
                ;;Do whatever I want here
                )))

In addition, since we are not defining our variables explicitly, adding new variable definitions to our boilerplate won't affect its calling conventions, so new features can be added without having to create a whole new function.

In any large project, there are inevitably templates to follow which cannot be reduced to functions, usually because of the bindings being created. Using boilerplate macros can make maintenance of such templated code much easier.

Likewise, other standard macros can be created which make use of variables defined in the boilerplate. Using macros like this significantly reduces typing because you do not have to constantly be writing and rewriting variable bindings, derivations, and parameter passing. This also reduces the potential for errors in such code.

Note, however, that boilerplate macros are not a panacea. There are many significant problems that can occur, including:

  • Accidentally overwriting bindings by introducing a variable name that was previously defined in a macro.
  • Difficulty tracing problems because the inputs and the outputs of the macros are implicit, not explicit.

These can be largely avoided by doing a few things in conjunction with your boilerplate macros:

  • Have a naming convention which clearly labels macros as such, and which indicates that a variable came from boilerplate code. This could be done by affixing -m to macros and -b to variables defined within a boilerplate.
  • Carefully document all boilerplate macros, especially the introduced variable bindings and all changes between versions.
  • Only use boilerplate macros when the savings in repetitiveness clearly outweigh the negatives of implicit functionality.

Abbreviate statements and prevent mistakes

Many programming languages require verbose statements to express simple operations. Code-generating programs let you abbreviate such statements, which saves a lot of typing and also prevents mistakes because there is less code to mistype.
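
As a hedged illustration of this idea, the sketch below expands a compact field list into a Java class with getters. The spec format ("name:Type" pairs separated by commas) and the names BeanGen and generate are assumptions made up for this example, not an existing tool.

```java
class BeanGen {
    // Expands a compact spec such as "name:String,age:int" into Java source
    // for a class with one private field and one getter per entry.
    static String generate(String className, String spec) {
        StringBuilder fields = new StringBuilder();
        StringBuilder getters = new StringBuilder();
        for (String entry : spec.split(",")) {
            String[] parts = entry.trim().split(":");
            String field = parts[0];
            String type = parts[1];
            // Capitalize the first letter to form the getter name, e.g. age -> getAge.
            String cap = Character.toUpperCase(field.charAt(0)) + field.substring(1);
            fields.append("    private ").append(type).append(' ').append(field).append(";\n");
            getters.append("    public ").append(type).append(" get").append(cap)
                   .append("() { return ").append(field).append("; }\n");
        }
        return "public class " + className + " {\n" + fields + getters + "}\n";
    }

    public static void main(String[] args) {
        // One short line of input produces the repetitive declarations.
        System.out.print(generate("Person", "name:String,age:int"));
    }
}
```

Because every getter comes from the same template, a typo can only occur once, in the generator, instead of in each hand-written method.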

Metaprogramming Framework in Java

Reflection

Introduction to reflection

Reflection is the ability to introspect metalevel information about the program structure itself at runtime. Usually this metalevel information is modeled using the general abstraction mechanisms available in the language. In Java, reflection makes it possible to discover the following information about the loaded classes:

  • Fields
  • Methods
  • Constructors
  • Generics information
  • Metadata annotations

It also makes it possible to use these metaobjects to operate on their instances at runtime.<ref name="meta_java">Abdelmonaim Remani, "The Art of Metaprogramming in Java", Jul 19, 2012 http://www.slideshare.net/PolymathicCoder/the-art-of-metaprogramming-in-java</ref>
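
A minimal sketch of using such metaobjects follows. The classes ReflectDemo and Point and the helper buildAndRead are hypothetical and exist only for this illustration; the calls themselves come from java.lang.reflect.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

class ReflectDemo {
    // Hypothetical class to inspect.
    static class Point {
        private int x;
        private int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Creates a Point through its Constructor metaobject, then reads one of
    // its private fields through a Field metaobject.
    static int buildAndRead(String fieldName, int x, int y) {
        try {
            Constructor<Point> ctor = Point.class.getDeclaredConstructor(int.class, int.class);
            Point p = ctor.newInstance(x, y);
            Field f = Point.class.getDeclaredField(fieldName);
            f.setAccessible(true); // the field is private
            return f.getInt(p);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Reflection failed for " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        // Discover the declared fields (the order is not guaranteed).
        for (Field f : Point.class.getDeclaredFields()) {
            System.out.println(f.getType() + " " + f.getName());
        }
        System.out.println(buildAndRead("y", 3, 4)); // prints 4
    }
}
```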

The following is the Java metalevel architecture<ref name="reflection">Mika Haapakorpi, "Meta Programming In Java"</ref>

Dynamic proxy classes

A dynamic proxy class implements a list of interfaces specified at runtime when the class is created.

  • A proxy interface is an interface that is implemented by a proxy class.
  • A proxy instance is an instance of a proxy class which has an associated invocation handler object.

The following is an example of dynamic proxy classes
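
As a minimal sketch of these definitions (Greeter, logging and ProxyDemo are hypothetical names introduced here; Proxy.newProxyInstance and InvocationHandler are the standard java.lang.reflect API), a proxy instance can record every call before forwarding it to a real object:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ProxyDemo {
    // Hypothetical proxy interface.
    interface Greeter {
        String greet(String name);
    }

    // Wraps a Greeter in a dynamic proxy whose invocation handler records
    // the name of each invoked method and then forwards the call.
    static Greeter logging(Greeter target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Greeter g = logging(name -> "Hello, " + name, log);
        System.out.println(g.greet("world")); // prints Hello, world
        System.out.println(log);              // prints [greet]
    }
}
```

The proxy class itself is never written in source form. It is created at runtime from the list of interfaces passed to newProxyInstance.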

Generics

Generics are a facility of generic programming that was added to the Java programming language in 2004 as part of J2SE 5.0. They allow "a type or method to operate on objects of various types while providing compile-time type safety." The generics can be used in classes, interfaces, methods and constructors.<ref name="reflection">Mika Haapakorpi, "Meta Programming In Java"</ref>

Two new types in generics

Generics introduce two new kinds of types:

  • Parameterized types
  • Type variables

A type variable is an unqualified identifier. Class and interface declarations can declare type parameters (type variables), and so can method and constructor declarations.

The following is an example of a parameterized type:

      List<String> anExample = new ArrayList<String>();
  • List : interface
  • ArrayList : class
  • String : class (the actual type argument)
  • List<String> and ArrayList<String> : parameterized types
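
Type variables themselves appear where a class or method is declared. The sketch below is illustrative only (GenericsDemo, Box and firstOrDefault are made-up names): it declares a type variable E on a class and a type variable T on a method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GenericsDemo {
    // A generic class: E is a type variable declared by the class.
    static class Box<E> {
        private final E value;
        Box(E value) { this.value = value; }
        E get() { return value; }
    }

    // A generic method: T is a type variable declared by the method itself.
    static <T> T firstOrDefault(List<T> items, T fallback) {
        return items.isEmpty() ? fallback : items.get(0);
    }

    public static void main(String[] args) {
        Box<String> box = new Box<>("boxed");   // Box<String> is a parameterized type
        String s = box.get();                   // no cast needed
        System.out.println(s);                                           // prints boxed
        System.out.println(firstOrDefault(Arrays.asList(1, 2, 3), 0));   // prints 1
        System.out.println(firstOrDefault(new ArrayList<String>(), "none")); // prints none
    }
}
```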

Class/Interface Declarations

The following is an example of generic interface declarations and their use:

      public interface List<E> {
        void add(E x);
        Iterator<E> iterator();
      }
      public interface Iterator<E> {
        E next();
        boolean hasNext();
      }
      List<String> anExample;
      anExample.add("sdfdfss");
      anExample.add(new Object()); // compile time error
      String aTest = anExample.iterator().next();

Metadata annotation

An annotation, in the Java computer programming language, is a form of syntactic metadata that can be added to Java source code. Classes, methods, variables, parameters and packages may be annotated.<ref name="reflection">Mika Haapakorpi, "Meta Programming In Java"</ref>

The following is an example of annotation:

       /**
       * Designates a formatter to pretty-print the annotated class.
       */
       public @interface PrettyPrinter{
       Class<? extends Formatter> value();
       }
       // Single-member annotation with Class
       // member restricted by bounded wildcard
       // The annotation presumes the existence of this class.
       class GorgeousFormatter implements Formatter { ... }
       @PrettyPrinter(GorgeousFormatter.class)
       public class Petunia { ... }

The @interface keyword declares an annotation type, and ‘@’ followed by the type name (here @PrettyPrinter) applies that annotation to a declaration.
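
Annotations become useful for metaprogramming when a program reads them back through reflection. A minimal sketch, assuming a made-up annotation Author and made-up classes Report and Plain, with the annotation retained at runtime so that reflection can see it:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationDemo {
    // Hypothetical single-member annotation. RUNTIME retention is required,
    // otherwise the annotation is not visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Author {
        String value();
    }

    @Author("example")
    static class Report { }

    static class Plain { }

    // Returns the Author value of a class, or null if the class is not annotated.
    static String authorOf(Class<?> type) {
        Author a = type.getAnnotation(Author.class);
        return (a == null) ? null : a.value();
    }

    public static void main(String[] args) {
        System.out.println(authorOf(Report.class)); // prints example
        System.out.println(authorOf(Plain.class));  // prints null
    }
}
```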

References

<references/>