CSC/ECE 517 Fall 2011/ch1 1d gs

Closures for Statically Typed Languages

Introduction

This wiki chapter gives an introduction to language constructs called closures. It explains their usage and discusses the challenges involved in implementing them in statically typed languages.

Closures

A closure is a kind of routine that can be assigned to a variable or passed as a parameter to another routine. It can access the local state (local variables, parameters, methods, and instance variables) which is visible in the place it was defined.

To rephrase, a closure is a block of code which meets the following three criteria.

1. It can be passed around as a value.

2. It can executed be on demand by anyone who has access to it, at any time.

3. It can refer to variables from the context in which it was created.

Why do we need closures and what are its uses?

DRY(Don't Repeat Yourself) is a popular software development principle, formulated by Andy Hunt and Dave Thomas, which stresses the importance of not duplicating code. Closures help in implementing the DRY principle and make the code easy to maintain. Closures considerably increase the level of a language by mixing access to local variables with remote execution of a set of locally-deﬁned statements.

Lets see an example,

 def paidMore(amount)
 return Proc.new {|e| e.salary > amount}
 end

This function returns a closure, the behavior of which depends on the parameter "amount" passed to the enclosing function.

Amount can be fixed to a value like below, the variable highPaid contains a block of code (called a Proc in Ruby) that will return whether an employee’s salary is greater than 150.

 highPaid = paidMore(150)
 //Let’s check john’s salary
 john = Employee.new
 john.salary = 200
 print highPaid.call(john)

The expression highPaid.call(john) executes the ( e.salary > 150 ) block and prints the result of the execution. As long as a closure lives, all the free variables accessed by it are not eligible for garbage collection. So, the variable amount persists until the closure returned by paidMore does. Hence,from the above sample code it is clear that even if the value 150 went out of scope, at the time of issuing the print call ,the binding would still remain.

The art of getting the best possible results by minimal code and the ease with which a user can use closures makes the latter a big success in Ruby.

A function to determine if an employee is a manager.Using C#, I'd probably write it like this.

public static IList Managers(IList emps) {
   IList result = new ArrayList();
   foreach(Employee e in emps)
     if (e.IsManager) result.Add(e);
   return result;
 }

In a language that has Closures, in our case (Ruby), I'd write this.

 def managers(emps)
   return emps.select {|e| e.isManager}
 end

Statically Typed vs Dynamically Typed Languages

A statically typed language performs type checking during compile-time. They include languages like Java , Objective-C ,Pascal etc.Static typing includes earlier detection of programming mistakes (e.g. preventing adding an integer to a Boolean), better documentation in the form of type signatures (e.g. incorporating number and types of arguments when resolving names), increased runtime efficiency (e.g. not all values need to carry a dynamic type), and a better design and development experience (e.g. knowing the type of the receiver, the IDE can present a drop-down menu of all applicable members).

A dynamically typed language is when the majority of its type checking is performed at run-time as opposed to that of compile-time. In dynamic typing values have types, but variables do not (a variable can refer to a value of any type) .These include Ruby,Smalltalk , Python ,JavaScript etc. Dynamically typed languages are indispensable for dealing with truly dynamic program behavior such as method interception, dynamic loading, mobile code, runtime reflection, but dynamic typing generates run time type errors,and it requires more testing. In dynamically typed languages, the method look up happens at run-time which is known as late binding and it allows objects to dynamically alter their behaviour, allowing greater flexibility in the manipulation of objects.

An important fact to note here is that the presence of static typing in a programming language does not necessarily imply the absence of all dynamic typing mechanisms. For example, Java and some other ostensibly statically typed languages, support downcasting and other type operations that depend on runtime type checks,which is a form of dynamic typing.

Closures in Dynamically Typed Languages

Closures in Ruby

Ruby is one of the languages that provides a very good support for closures. It has four different ways to define and use closures.

Scheme Closures and Scoping

Scheme is a statically scoped dialect of the Lisp programming language invented by Guy Lewis Steele Jr. and Gerald Jay Sussman. It is primarily a functional programming language with very simple syntax based on s-expressions, parenthesized lists in which a prefix operator is followed by its arguments..

Scheme has same scoping as in ruby. Lets see how closures are implemented in scheme

     ; Defines and applies function func using free variable a
      (let* ((a                     ; define a
           1)
      (func                 ; define func
           (lambda () (+ a 0.1))) ; line# 4
      (func))                 ; prints 1.1

The closure on line #4 accesses the free variable a and increments its value by 0.1

   (let* ((a      1)
      (func1   (lambda () (+ a 0.1)))
      (func2    (let* ((a   2))  func1)); Let's change a to 2, to try to make func1 use a=2
             (func2))     ; prints 1.1 from func1

func1 is called by func2, but the value printed by func1 is not influenced by the func2's locally defined value for a.

Closures in Statically Typed Languages[A Challenge, Why?]

There are several factors that prevent pure closure constructs from being implemented in statically typed languages, we will address a few of them.

Closure like constructs in C

Consider C, a statically typed language.Closure-like constructs in C are function pointers.

  /*function pointer definition & initialization*/
    int (*ptr_2_paidMore)(int,char[])=NULL;
  /*Function definition*/
    int paidMore(int threshold, char name[])
    {
         /*Code for looking up employee name 
         Code for retrieving employee salary*/
         if(salary > threshold)
          {
          return 1;
          }
       return 0;		
    } 
    /*Assign address of paidMore to function pointer*/
    ptr_2_paidMore = &paidMore;

Now ptr_2_paidMore can be passed around and called from wherever access to it is available.

  void passPointer(int (*ptr_2_paidMore) (int, char[])
  {
       char name[]="John";
         int isPaidMore = (*ptr_2_paidMore)(150,name);
  }

However, it is very evident from the above example that, unlike its ruby counterpart, it does not have the flexibility,conciseness and cannot access variables in its lexical context. The function can only act upon the parameters that are passed to it.

Closures in Java

Java already has something close to closures called Anonymous inner classes which are known to be imperfect for a number of reasons [1]

For now the lambda project for Java Closure implementation is trying to capture all final local variables and make 'this' within the lambda expression lexically scoped. Lambda Expressions are anonymous functions, aimed at addressing the bulkiness of anonymous inner classes.

Some examples of lambda expressions:

  #{ -> 42 }
  #{ int x-> x+1 }

The method body can either be a single expression or an ordinary list of statements (like a method body). The return types of lambda are complex return types called SAM'S type [2] which make use of the imperfect anonymous inner classes.

Consider an integer array, the elements on which I would like to perform an operation and copy the contents to an output array.

For Example:

  int temp=0;
  for(i=0;i<sizeof(input);i++)
  {//Perform Some operation  on input[i] and assign to temp
     output[i]=temp;
  }

Suppose the same operation has to be performed on a float array, the code has to be repeated in a statically typed language. In Java at least objects can be manipulated as generic types, but primitive types still have the problem.

To an extent repeating the same closure for different types defeats the purpose of a closure.In a dynamically typed language like Ruby this can be accomplished using a single closure.

Java Lexical Scoping for Closures

Java does not want to permit the capture of mutable local variables. This is because it's quite difficult to write lambda bodies that do not have race conditions. Sometimes the lambda body may be passed outside the current thread and when other threads try to execute the lambda body,some race conditions may arise leaving the variables in an inconsistent state.

In short, operating with statically typed variables and assigning static types to closures’ return types creates an overhead and is cumbersome for a pure closure implementation in statically typed language.

Return from Closure

In ruby, the following function ex2 passes a closure to function ex3. The Closure computes the value of x and returns it.

Considering the following example:

def ex2
   y=2
   ...
  closure=lambda { x = y + 2 }
   ...
  ex3(closure)
   ...
 end

If the same has to be achieved in C/Java, the return statement will be used which would return from the function ex2 and not execute the remaining statements within ex2.

In Ruby a closure iterates through a list of employees and returns either their names, locations, skills, salary or role depending on the parameter passed to it.The return value/values of this closure change at runtime depending on the parameter passed to it.

Writing a similar closure in C/Java would require different closures for different return types which obviously makes it a cumbersome task.

Safety

In C, even if a nested function (platform specific, supported by GCC) is written, it cannot access the local variables in the outer function when the function exited.

Consider the following example:

hack (int *array, int size)
    {
      void store (int index, int value)
        { array[index] = value; }
    
      intermediate (store, size);
    }

Here, the function "intermediate" receives the address of the function store as an argument. If now intermediate calls store, the arguments given to store are used to store into array. But this technique works only so long as the containing function (hack, in this example) does not exit. Also note, if array was a global variable and the above example is rewritten as below , it would work fine.

 int *array;
 hack( int size)
 {
        void store (int index, int value)
        { array[index] = value; }
    
        intermediate (store, size);
 }

These are not strictly type-errors but it goes against the safety principles of a statically typed language.

Closures in C#

C# has gone a long way to implement closures. Lets compare it with ruby. Delegates in C# are used to reference methods, they are similar to function pointers in C++. Delegates can be used with any method that have the same parameters and return value.

 A delegate in C#:
 delegate int ExampleXY(int x,int y);

Any method that has the same signature as the delegate declared above can be used with it, like:

 int add(int a,int b)
 {
   a+=b;
   return b;
 }
 //Assigning a method reference to delegate
 ExampleXY ex=add;
 Console.WriteLine(ex(1,2));

Delegates in C# can refer to the free variables in their lexical context. As far as delegates are concerned, the signature of the delegate and the method it references has to be the same.This will lead to code duplication for different data types, if "add" has to be written for other numerical data types. To solve this, C# has Generic delegates.

 Generic Delegate:
 delegate T ExampleXY<T>(T item);

"add" when written as a generic method:

 T add<T>(ref T a, ref T b)
 {
   return a+b;
 }

Assignment to delegate and invocation:

 ExampleXY<int> ex_int = add;
 ExampleXY<float> ex_float = add;
 Console.WriteLine(ex_int(1,2));
 Console.WriteLine(ex_float(1.3,5.7));

Now, we have two references to the same method, for two data types - int and float.

C# also has lambda expressions like ruby of the following format:

 parameter => {execution statements}

Example:

 x => x+1

Assignment of lambda expressions to generic delegates:

 Func<int, int, int> add_int = (a,b) => a+b ;
 Func<float,float,float> add_float = (a,b) => a+b ;
 int result1 = add_int(1,2);
 int result2 = add_float(3.0,5.0);

Here again we have to duplicate code for different data types.

A ruby variant of the above "add" example

 x = lambda { |a,b| a+b }
 x.call 10,20
 x.call 2.5,6.7

The C# example works very similar to ruby closures. The closure written in ruby above works for all objects on which the '+' operation is valid unlike C#, which requires separate delegates for different data types. Ruby's closures implementations have a simple syntax and are very elegant compared to C#.

Conclusion

From the above discussion, we can conclude that, closures are implemented best in dynamically typed languages. Some statically typed languages which are trying to implement closures use dynamic binding. However, the syntax and redundant code used to implement closures do not make them a compelling option over dynamically typed languages.