CSC/ECE 517 Spring 2013/ch1 1h jc

From Expertiza_Wiki
Revision as of 22:58, 17 February 2013 by Cyang14 (talk | contribs) (→‎Reflection)

Metaprogramming in statically typed languages

Introduction

What is metaprogramming

Metaprogramming in statically typed languages

Implementation

Exposing the internals of the compiler as an API

Program transformation system

Metaprogramming using Scheme

Common Uses

Pre-generate static data at compile time

One common use of metaprogramming in statically typed languages is to write programs that will pre-generate tables of data for use at runtime.

One simple but useful kind of code generator builds static lookup tables. A common way to make a C function fast is to replace the computation with a lookup table of precomputed answers. The answers must then either be computed by hand ahead of time (which wastes the programmer's time) or built at runtime (which wastes the user's time).

In the following example we will build a generator that takes a function, or set of functions, on an integer and builds lookup tables of the results.

To work out how to write such a program, we can start from the end and work backward. Suppose we need a lookup table that returns the square roots of the integers from 5 to 20. A simple program that fills such a table at runtime looks like this:

Generate and use a lookup table of square roots

       #include <math.h>   /* sqrt */
       #include <stdio.h>  /* printf */

       /* our lookup table */
       double square_roots[21];
       /* function to load the table at runtime */
       void init_square_roots(void)
       {
          int i;
          for(i = 5; i < 21; i++)
          {
             square_roots[i] = sqrt((double)i);
          }
       }

A single table macro, like the one processed by the code generator later in this section, can take away a lot of this work for any program that needs mathematical tables indexed by integer. A little extra work would also allow tables containing full struct definitions; a little more would ensure that space isn't wasted at the front of the array on unused entries. For now, the program that uses the runtime-initialized table looks like this:

       /* program that uses the table */
       int main ()
       {
          init_square_roots();
          printf("The square root of 5 is %f\n", square_roots[5]);
          return 0;
       }

Now, to convert this to a statically initialized array, you would remove the first part of the program and replace it with something like this, calculated by hand:

Square root program with a static lookup table

       double square_roots[] = {
          /* these are the ones we skipped */ 0.0, 0.0, 0.0, 0.0, 0.0,
          2.236068, /* Square root of 5 */
          2.449490, /* Square root of 6 */
          2.645751, /* Square root of 7 */
          2.828427, /* Square root of 8 */
          3.0, /* Square root of 9 */
          ...
          4.472136 /* Square root of 20 */
          };

What is needed is a program that will produce these values and print them out in a table like the previous one so they are loaded in at compile-time.

Code generator for the table macro

          #!/usr/bin/perl  
          #
          #tablegen.pl 
          #
          ##Puts each program line into $line
          while(my $line = <>)
          {
            #Is this a macro invocation?
            if($line =~ m/TABLE:/)
            {
               #If so, split it apart into its component pieces
               my ($dummy, $table_name, $type, $start_idx, $end_idx, $default, $procedure) = split(m/:/, $line, 7);
               #The main difference between C and Perl for mathematical expressions is that
               #Perl prefixes its variables with a dollar sign, so we will add that here
               $procedure =~ s/VAL/\$VAL/g;
               #Print out the array declaration
               print "${type} ${table_name} [] = {\n";
               #Go through each array element
               foreach my $VAL (0 .. $end_idx)
               {
                  #Only process an answer if we have reached our starting index
                  if($VAL >= $start_idx)
                  {
                     #evaluate the procedure specified (this sets $@ if there are any errors)
                     $result = eval $procedure;
                     die("Error processing: $@") if $@;
                  }
                  else
                  {
                     #if we haven't reached the starting index, just use the default
                     $result = $default;
                  }
                  #Print out the value
                  print "\t${result}";
                  #If there are more to be processed, add a comma after the value
                  if($VAL != $end_idx)
                  {
                     print ",";
                  }
                  print "\n"
               }
               #Finish the declaration
               print "};\n";
            }
            else
            {
               #If this is not a macro invocation, just copy the line directly to the output
               print $line;
            }
          }

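To run this program, put the C source containing the TABLE: macro invocation in an input file (sqrt.in here), pipe it through the generator, and compile the result. The generator splits the invocation on colons into the table name, element type, start index, end index, default value, and the expression to evaluate, in which VAL stands for the current index: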

Running the code generator

          ./tablegen.pl < sqrt.in > sqrt.c
          gcc sqrt.c -o sqrt
          ./sqrt
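
The generation step itself can also be sketched in C. The following is only an illustration under stated assumptions, not part of the Perl tool above: the names `table_fn`, `half_of`, `append`, and `emit_table` are hypothetical, and the entry values come from a C callback instead of an evaluated expression string. The output is written into a caller-supplied buffer in the same layout the Perl script prints.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback type: computes the table entry for index val. */
typedef double (*table_fn)(int val);

/* Example callback for illustration; any function of the index would do. */
static double half_of(int val)
{
    return val / 2.0;
}

/* Append formatted text at buf[*len]; 0 on success, -1 if it does not fit. */
static int append(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf + *len, cap - *len, fmt, args);
    va_end(args);
    if (n < 0 || (size_t)n >= cap - *len)
        return -1;
    *len += (size_t)n;
    return 0;
}

/*
 * Write "type name[] = { ... };" into buf. Entries below start receive the
 * default value dflt; entries from start to end (inclusive) receive fn(index).
 * Returns the number of characters written, or -1 if buf is too small.
 */
int emit_table(char *buf, size_t cap, const char *type, const char *name,
               int start, int end, double dflt, table_fn fn)
{
    size_t len = 0;
    int val;

    if (cap == 0 || append(buf, cap, &len, "%s %s[] = {\n", type, name) != 0)
        return -1;
    for (val = 0; val <= end; val++) {
        double result = (val >= start) ? fn(val) : dflt;
        /* Every entry except the last is followed by a comma. */
        if (append(buf, cap, &len, "\t%f%s\n", result, val != end ? "," : "") != 0)
            return -1;
    }
    if (append(buf, cap, &len, "};\n") != 0)
        return -1;
    return (int)len;
}
```

A build step could call `emit_table` with a square-root callback and write the buffer to a generated source file, which is then compiled along with the rest of the program.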

Mini-language for boiler-plate

If you have a large application where many of the functions include a lot of boilerplate code, it is often a good idea to create a mini-language that allows you to work with your boilerplate code in an easier fashion. This mini-language will then be converted into your regular source code language before compiling.

The following is an example:

Let's say that we are building a CGI application consisting of many independent CGI scripts. In most CGI applications, much of the state is stored in a database, but only a session ID is passed to each script via a cookie.

However, in nearly every page we need to know the other standard information (such as the username, the group number, the current job being worked on, and any other pertinent information). In addition, we need to redirect users who do not have an appropriate cookie. Without a macro, every script starts with boilerplate like this:

           (define (handle-cgi-request req)
             (let (
                    (session-id (webserver:cookie req "sessionid")))
               (if (not (webserver:valid-session-id session-id))
                   (webserver:redirect-to-login-page)
                   (let* (
                         (username (webserver:username-for-session session-id))
                         (group (webserver:group-for-user username))
                         (current-job (webserver:current-job-for-user username)))
                      ;;Code for processing goes here
                      ))))

While some of that can be handled by a procedure, the bindings certainly cannot. However, we can turn most of it into a macro. The macro can be implemented like this:

           (define-syntax cgi-boilerplate
             (lambda (x)
               (syntax-case x ()
                 ((k expr)
                  ;; k carries the lexical context of the call site, so the
                  ;; bindings introduced below are visible inside expr
                  (datum->syntax-object
                    (syntax k)
                    `(let ((session-id (webserver:cookie req "sessionid")))
                       (if (not (webserver:valid-session-id session-id))
                           (webserver:redirect-to-login-page)
                           ;; let* because group and current-job depend on username
                           (let* ((username (webserver:username-for-session session-id))
                                  (group (webserver:group-for-user username))
                                  (current-job (webserver:current-job-for-user username)))
                             ,(syntax-object->datum (syntax expr))))))))))

We can now create new forms based on our boilerplate code by doing the following:

           (define (handle-cgi-request req)
             (cgi-boilerplate
              (begin
                ;;Do whatever I want here
                )))

In addition, since we are not defining our variables explicitly, adding new variable definitions to our boilerplate won't affect its calling conventions, so new features can be added without having to create a whole new function.

In any large project, there are inevitably templates to follow which cannot be reduced to functions, usually because of the bindings being created. Using boilerplate macros can make maintenance of such templated code much easier.

Likewise, other standard macros can be created which make use of variables defined in the boilerplate. Using macros like this significantly reduces typing because you do not have to constantly be writing and rewriting variable bindings, derivations, and parameter passing. This also reduces the potential for errors in such code.

Note, though, that boilerplate macros are not a panacea. Several significant problems can occur, including:

  • Accidentally overwriting bindings by introducing a variable name that was previously defined in a macro.
  • Difficulty tracing problems because the inputs and the outputs of the macros are implicit, not explicit.

These can be largely avoided by doing a few things in conjunction with your boilerplate macros:

  • Have a naming convention that clearly labels macros as such and indicates that a variable came from boilerplate code. This could be done by affixing -m to macros and -b to variables defined within a boilerplate.
  • Carefully document all boilerplate macros, especially the introduced variable bindings and all changes between versions.
  • Only use boilerplate macros when the savings in repetitiveness clearly outweigh the negatives of implicit functionality.
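
The naming convention can be illustrated with a rough analog in C. This is a sketch only: C preprocessor macros are purely textual and not equivalent to Scheme's `syntax-case`, and every name here (`WITH_SESSION_M`, `session_id_b`, `username_b`, `handle_request`, and the stand-in web server functions) is hypothetical. The `_M` suffix marks the macro, and the `_b` suffix marks each binding the macro introduces into the caller's code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the web server API of the Scheme example. */
static int redirects = 0;

static int valid_session_id(const char *session_id)
{
    return session_id != NULL && strcmp(session_id, "abc123") == 0;
}

static void redirect_to_login_page(void)
{
    redirects++;
}

static const char *username_for_session(const char *session_id)
{
    (void)session_id;
    return "alice";
}

/*
 * Boilerplate macro. It introduces session_id_b and username_b into the
 * scope of body; the _b suffix tells the reader they came from boilerplate.
 */
#define WITH_SESSION_M(cookie_expr, body)                                \
    do {                                                                 \
        const char *session_id_b = (cookie_expr);                        \
        if (!valid_session_id(session_id_b)) {                           \
            redirect_to_login_page();                                    \
        } else {                                                         \
            const char *username_b = username_for_session(session_id_b); \
            body                                                         \
        }                                                                \
    } while (0)

/* Returns the length of the greeting, or -1 if the user was redirected. */
int handle_request(const char *cookie, char *out, size_t cap)
{
    int written = -1;

    WITH_SESSION_M(cookie, {
        written = snprintf(out, cap, "hello, %s", username_b);
    });
    return written;
}
```

Because `username_b` never appears in `handle_request` outside the macro body, the suffix is the only hint of where the name is defined, which is why documenting the introduced bindings matters.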

Abbreviate statements and prevent mistakes

Many programming languages require verbose statements to do simple things. Code-generating programs let you abbreviate such statements, which saves typing and also prevents mistakes, because there is less text to mistype.
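
As one small illustration in C, the preprocessor can generate an enum and its matching name table from a single list, a technique often called X-macros. This is a sketch with hypothetical names (`COLOR_LIST`, `AS_ENUM`, `AS_STRING`, `color_name`); it shows how writing each item once removes the chance of the two definitions drifting apart.

```c
/* Hypothetical list of items; each entry is written exactly once. */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

/* First expansion: enum constants COLOR_RED, COLOR_GREEN, COLOR_BLUE. */
#define AS_ENUM(name) COLOR_##name,
enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };

/* Second expansion: a table of printable names in the same order. */
#define AS_STRING(name) #name,
static const char *const color_names[] = { COLOR_LIST(AS_STRING) };

/* Look up the printable name; out-of-range values map to "UNKNOWN". */
const char *color_name(enum color c)
{
    if ((int)c < 0 || (int)c >= COLOR_COUNT)
        return "UNKNOWN";
    return color_names[c];
}
```

Adding a new color means adding one `X(...)` line; the enum constant, the name string, and `COLOR_COUNT` all follow automatically.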

Metaprogramming Framework in Java

Reflection

Reflection is the ability to introspect metalevel information about the program structure itself at runtime. Usually this metalevel information is modeled using the general abstraction mechanisms available in the language. In Java, reflection makes it possible to discover information about loaded classes:

  • Fields
  • Methods
  • Constructors
  • Generics information
  • Metadata annotations

It also makes it possible to apply these metaobjects to instances at runtime, for example to read a field of an object or invoke a method on it.

[Figure: the Java metalevel architecture]

Generics

Metadata annotation

Limitations