CSC/ECE 517 Fall 2011/ch1 1d sr

Wiki Chapter: CSC/ECE 517 Fall 2011/ch1 1d sr

Introduction

Closures are a topic of interest in computer science. While they provide useful functionality, they are more often included in dynamic languages than in statically typed languages. This article covers what closures are and why they are useful. It also explains why implementing closures in statically typed languages is a challenge, giving examples along the way from both dynamically typed and statically typed languages.

Closures

What are Closures?

A closure is basically a function or a method (or a block, in the context of ruby) that has the following two properties:

It is a first-class function, meaning it can be passed around like an object or as a value parameter to other functions/methods/blocks
It saves its lexical environment, meaning it captures the variables within scope at its creation and maintains them even if they later go out of scope.[1]

In order for a programming language to be able to support closures, it must support the notion of "first class functions." A first class function is a function that can be treated like an object, such as passing it as a parameter or returning the function as the result of another method. [2] Since a closure can be passed around as a value before calling it, it can be called in a completely different scope than the one in which it was created and it is because of this that it needs to retain the knowledge of the lexical environment in which it was defined. [3]

Why Closures

For functional languages, which themselves are essentially stateless, closures offer a way to store some kind of state at least for as long as the closure lives. [4]
Closures help functional languages to be terse in expressing logic. Closures also offer a very succinct way of performing some neat programmatic operations i.e. if a closure modifies the value of a variable, it will retain the new value the next time the closure was invoked. [5]
Higher order, special purpose functions like select and inject in Ruby, which do a lot of useful stuff with very little code are possible only because the language supports closures.
Currying of functions in languages like Ruby is only made possible through the use of closures. Currying is taking a function and one or more of its parameters to produce a new function with fewer parameters, which can be useful for human readability and coding.[6]
For procedural or imperative languages, the argument for closures is a twofold one. Procedural languages already have mechanism to store state via the use of global variables, static variables etc. but closures offer a way to pass around units of computation that can be executed later. The function pointers and the callback function ideology in C is centered around this motivation. For object-oriented imperative languages like C++ or the Objective-C, closures provide for the lack of a syntactically light-weight way to define simple object functions for the effective use of several generic algorithms like those offered by Standard Template Library.

Closures in Dynamically Typed Languages

A dynamically typed language is a programming language which does majority of its type-checking at run-time instead of compile-time [7]. In dynamic typing, values have types but variables do not. Hence, a variable can refer to a value of any type. This important property of dynamically typed languages offers itself as a convenient way of implementing and supporting closures. For closures, a variable can point to a block of code and the associated lexical environment. A special-purpose programming construct called "lambda" is used in some languages to express anonymous functions i.e functions which are not bound to a name at runtime. Lambda is used to generate new functions dynamically. The concept of "lambda" has been derived from the lambda calculus in the functional programming [8].

Example use of Closures in Dynamically Typed Langauges

The following are examples of closure usage in some of the dynamically typed languages. These examples cover a particular and common usage of closures, but all the aspects and subtleties of closures in a particular language (in Ruby, for example, there are Seven distinct ways of implementing closure or closure-like structures[9]) will not be covered here.

Example in Ruby

In Ruby, for all intents and purposes, "Blocks" [10] are closures. They are closed with respect to the variables defined in the context in which they were created, regardless of the context in which they are called. The subtle difference is that a "Block" can not be passed as a specific named parameter and "yield" is the only way by which control can be given to the block that was passed to the method in which yield is called.

class ClosureTest
 
   def initialize(x)
      @x = x
   end
 
   def call_closure(ClosureBlock)
     ClosureBlock.call @x
   end
end
a = 100
ClosureBlock = lambda {|x| puts x+a}
ClosureObj = ClosureTest.new(10)
ClosureObj.call_closure(ClosureBlock)      # This will output "110"
a = 200
ClosureObj.call_closure(ClosureBlock)      # This will output "210"

As seen in the above example, "ClosureBlock" is a closure. The use of lambda construct means that the ClosureBlock variable would be of type "Proc". When ClosureBlock is defined, the variable "a" is in its scope and hence it will remember the value that "a" was bound to. However when ClosureObj, an object of class ClosureTest calls "call_closure" method with ClosureObj passed as a parameter (closures can be passed around as objects), it will remember the value of "a" even though "a" is now no longer in scope and will print "100+10" as the answer. On the similar notes, when "a" is modified and call_closure is invoked again, the ClosureBlock will pick up the value of "a" in the lexical context in which ClosureBlock was created i.e. it will pick up the latest value of "a".

Example in JS

JavaScript is also a dynamically typed language and it supports closures in its most rudimentary form. It is not required to use lambda construct to be able to use closures in JavaScript. The following is an example of closures in JavaScript:

  function calledFunction(passedFunction) {
         passedFunction("newValue");
  }
  function clickMe() {
       var value = "originalValue";
       alert(value);
       calledFunction(function(externalValue) {
           value = externalValue;
       });
       alert(value);
  }

In function "clickMe", the output of 'alert' would be "originalValue". The function block gets created on the fly while calling "calledFunction" in "clickMe". The passed function is a closure and it remembers the value of "value" that was within the context of its definition. So when "passedFunction" is called from calledFunction, it will assign the "newValue" to "value" even though "value" is not in the scope of calledFunction. This is possible because of closures. The output of next 'alert; would be this "newValue" [11]

Example in Python

For a highly powerful and dynamically typed language like Python, closures act as one more useful feature. Below is a simple example of closures in Python:

def counter(start=0, step=1): 
     x = [start] 
     def _inc(): 
          x[0] += step 
          return x[0]
     return _inc 
c = counter() 
c() # This will output "1"
c() # This will output "2" 
c() # This will output "3"

The closure names "_inc" will remember the value of variable "step" which was in scope when it the closure function was defined. [12]

Closures in Statically Typed Languages

Challenges For Closures in Statically Typed Languages

Closures are not usually implemented in statically typed languages, due to their nature. The following are some obstacles statically typed languages must overcome:

While dynamically typed languages allow for any variable to be associated with any type, static languages require types declared at compile time. This would require a specific type for closures if they were implemented in a statically typed language.[13]
Many statically typed languages keep track of in-scope variables with a stack-based system: variables are pushed on when entering a scope and popped when exiting a scope. This can be a challenge for implementing closures in statically typed languages, since closures must maintain access to variables even when they go out of scope.
Since closures require saving the lexical environment of its definition, the ways and means provided by a language for doing this determines the level of difficulty and effectively the feasibility of implementing closures in that language. In statically typed languages like C, one way of doing this would be to make a copy of all the variables that are needed when a closure was defined and passing this copy around as its context. Alternative to this copying strategy, one way could be to retain a reference to them, thus making them not eligible for garbage collection. [14]

Implementing Closures in Statically Typed Languages

The following sections discuss specific statically typed languages in relation to closures.

C : function pointers

The concept of closures requires that a block of code along with the lexical context of its creation needs to be saved. C's context is based on the local variables which are created on the stack frame corresponding to the function currently being executed, the registers of the CPU which might store specific values and the global variables. Anything other than this is handled by the dynamic memory allocation provided by malloc.

The programming construct of "function pointers" is a useful tool for implementing closures in C. A function pointer is, as the name suggests, a pointer which stores address of any function. One can pass this function pointer holding the address of a particular function to another function as a parameter. The function which receives the function pointer as a parameter can now "call" or execute the function pointed to by the function pointer. This, in its most rudimentary form, is the basic technique by which C can pass along blocks of code to another functions. [15]

However the definition of closure also mandates that the lexical context at the time of its creation need also be saved and accessible later. This proves to be a painful task and one needs a logic to provide this functionality in the most concise and acceptable manner. One way to do this would be to provide a programmer defined struct which holds the values of all the variables which were "in scope" of the function that is to be treated as closure.

Note here that the struct variable which stores the values of all the variables in the lexical context of the function under consideration must not be "statically" allocated i.e. it must not be allocated on the stack of the method currently being executed. So this context saving structure variable needs to be allocated from the heap using malloc and the pointer to this "context structure" will need to be passed along with the function pointer to any functions that need to execute this later. However, note that because of the inherent nature of "statically" typed behavior, one would have to associate a particular function pointer with the function with matching signature and hence this combination of function pointer and user data struct pointer is not exactly type independent. [16]

"Type independence" in this context means that a function block would have to be closely tied with its signature like the data types of the parameters that the function expects, the data type of the return value that the function should return etc. So to create a different closure corresponding to a function of different signature, one would need the definition of another function pointer with compatible signature. [17]

An example implementation of Closures in C

Consider the below Ruby block. The function "caller" defines a closure using lambda construct and returns it to a variable called ClosureVar. The variable "x" which is passed as a parameter to function "caller" forms the part of lexical environment that the closure should save and keep track of.

 def caller x
      puts x
      lambda do
          x += 1
          puts "Inside closure"
          puts x
      end
  end
  ClosureVar = caller(5)
  ClosureVar.call             # This will output 6
  ClosureVar.call             # This will output 7

The below structure would be the placeholder for the closure implementation in C. Since for this example, "x" is the only variable which is part of the lexical context in which closure was defined, only that variable is included in the struct. If more variables are part of it, they would need to be included in the struct definition as well. [18]

 struct closure {
       void (* call_closure)(struct closure *);
       int x;
 };

Below is the definition of the actual function which will hold the statements that were part of the block of code that constitutes the closure in the above ruby snippet.

 void closure_block(struct closure * env) {
       env->x += 1;
       printf ("block: x is %d\n", env->x);
 }

And finally, the C code snippet below would be the implementation of "caller" function.

   void (*fptr_closure_block) (struct closure *) = 0;       /* Define a function pointer which can point to functions whose
                                                               signature matches that of closure_block.
                                                             */
   fptr_closure_block = closure_block;                      /* Make this function pointer point to closure_block */

   struct closure * caller(int x)
   {
        struct closure * ClosureVar = (struct closure *)malloc(sizeof(struct closure *));
        ClosureVar->x = x;
        printf ("x is %d\n",ClosureVar->x);
        ClosureVar->call_closure = fptr_closure_block;
        return ClosureVar;
   }

As can be seen from the above example, implementing a simple closure scenario in C involves a lot of messy logic and function pointer manipulation. For implementing more advanced feature like multiple closures, an even more complicated code would be required. It is thus, not at all, intuitive to implement closures in C.

C++ : Function Objects

Implementing closures in C++ requires a different philosophy because of the Object Oriented nature of the language. Just having a struct of function pointer pointing to the function to be used as closure and a pointer to the context of the function will not suffice. In C++, it is not possible to pass around pointer to a member function. In C++ world, a member function is part of the context of the object and hence can not be accessed when the object ceases to exist or goes out of scope. On the inside, calling a member function is equivalent to calling a piece of code i.e. the function with a hidden argument "Object *this" which is actually a reference to the object instance [19]. A specific C++ program construct called as a "functor" of function object can be used to implement an equivalent of closures.

Example implementation using Function Objects

Function objects are objects specifically designed to be used with a syntax similar to that of functions. In C++, this is achieved by defining member function operator()(). In other words, a functor or a function object would be the object of a class that defines the function call operator i.e. "operator () ()" as a member function. This will mean that using the object with "()" would be equivalent to calling a function. Also the data members of this class can be thought of as the variables that constitute the context of the function. This example is cited from this discussion thread. about closures in C++

Consider the below example. Whenever you create the object "Adder10" and actually do the call of "Adder10" with argument "20", it will add 20 to 10 and return the result. The private data member "x" is set once via the constructor at object initialization. Whatever is passed as the argument will be added to the value of "x" and be returned.

class ClosureAdd {
 ClosureAdd(int x) : x(x) {}
 int operator()(int y) { return x + y; }
 private:
    int x;
};
// Now you can use it like this:
ClosureAdd Adder10(10);                 // create a functor
int i = Adder10(20);                    // and "call" it
cout << i                               // this will print "30" as the answer

Java : anonymous inner classes

Java can provide similar functionality to closures through anonymous inner classes.

Anonymous inner classes are unnamed classes that are simultaneously defined and instantiated with a single new operator. The following define the syntax for anonymous inner classes: [20]

   new class-name ( [ argument-list ] ) { class-body }

   new interface-name () { class-body }

The following is an example closure implementation in Java from Ricky's Technical Blog:

  public void example()
  {
          final double use=Math.random()*10000;
  
          SwingUtilities.invokeLater(new Runnable()
          {
                  public void run()
                  {
                          System.out.println(use);
                  }
          });
  }

The above is a closure in that the method 'run' inside the new 'Runnable' object can reference the variable 'use'.

Anonymous inner classes have limitations as closures. For simulating closures, the local variables from enclosing scopes used in the anonymous class must be declared final. This is because variables are not resolved in the enclosing scope, but in the class's scope. Using Java's anonymous inner classes in this way is also not as simple to implement code-wise and overly verbose in comparison to the use of closures in dynamically typed languages like Ruby.[21]

James Gosling, regarded as the inventor of Java, stated that "closures were left out of Java initially more because of time pressures than anything else." He also explains inner classes being added to Java to alleviate the lack of closures, but that this was not a real solution. Gosling feels he should have gone all the way to implementing closures back then. [22] Proposals are currently being made for Java to include closures as a feature. [23]

Closures and Static Scoping

Static scoping is a feature of a programming language that makes closures possible for statically typed languages that use it. Static scoping, also known as lexical scoping, is scoping in which variables are looked up according to the context in which they are defined. This ensures that the lexical context is maintained in languages that allow for first class functions, which means the first class functions create closures. [24] Examples of such languages include JavaScript and Scheme, both of which are statically typed. Scheme, a variant of Lisp, was the first language to fully support closures [25], due in part to its static scoping.

Consider the below Scheme code snippet. The example is derived from this thread.

 (let* ((foo                     ; define foo
           10)
      (incfoo1                 ; define incfoo1
           (lambda () (+ foo 0.1)))
      (bar                     ; define bar
           65))                ; bar not used (just for symmetry)
    (incfoo1))                 ; prints 10.1

The variable "foo" is in static scope when the function "incfoo1" is defined. Later, when incfoo1 is called, it will print "10.1", thus remembering the value "10" that foo was assigned to, even though it is no longer in scope at the time of the call.

Conclusion

As has been seen, closures are a useful programming construct that can be taken advantage of by programming languages that allow it. These languages are usually dynamically typed, since implementation of closures in statically typed languages is challenging or overly verbose. Despite the challenge, it is still possible to have closures in statically typed languages, such as with static scoping. Some of the more popular statically scoped languages, such as Java, hope to add options for closures eventually.