More experimenting with closures and iteration led to some interesting results. Due to some quirks in Ruby's scope rules there are a few gotchas, and some of my initial assumptions about how variables are bound were wrong. Hopefully this entry will serve to enlighten those who, like me, wondered about the mystical Ruby closure.

What are closures?

A closure is a stateful object created on the fly when a block of code with its own scope becomes bound to variables in its environment. A closure consists of two elements. The first is a reference to some executable code such as a block, proc, or method. The second is the set of variables from the surrounding scope that the code is bound to. This code creates a closure, stored in c, that is bound to n: n = 2; c = lambda { n * n }. A closure is created any time a lambda references variables defined outside its own scope. Loosely, in C terms, a closure is:


struct Closure {
    void *function;     /* the executable code: a block, proc, or method */
    void *environment;  /* the variables that code is bound to */
};
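To make the binding concrete, here is a minimal sketch built on the one-liner above. The names square, first, and second are mine, chosen for illustration. It suggests that the lambda holds on to the variable n itself rather than a snapshot of its value, so a later assignment to n is visible on the next call.

```ruby
n = 2
square = lambda { n * n }  # the lambda closes over the variable n itself

first = square.call        # n is 2 at this point, so this is 4
n = 3                      # reassign n in the enclosing scope
second = square.call       # the closure sees the new value: 9

puts first, second
```

This is the "environment" half of the struct: the lambda carries a reference to the scope that n lives in.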

What problem do closures solve?

Closures are extremely useful for implementing callbacks in event-driven systems. Most of us are familiar with JavaScript's event handlers, which store a function as a first-class value to be called at the appropriate time. Creating a reference to a function stores not only the equivalent of a function pointer in C, but also any variables referenced by that function. This provides a mechanism to automatically restore the function's environment whenever it is called.

In C this would be accomplished with an additional parameter on the callback function for state data, which would be initialized when the callback was registered. For example, several buttons might share the same callback function, but to know which one was pressed the callback would need an additional parameter. If the library implementing the callback didn't provide a parameter for state, one would need to register a different callback for each button and then call the common function after some initialization. Callbacks in an object-oriented language are often implemented by registering a stateful object and using setters (obj.meth = val) to explicitly initialize the environment that will exist when a particular callback method is invoked. Closures usually provide a much more elegant solution by binding the environment implicitly.
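The button scenario can be sketched in Ruby roughly as follows. The Button class and its on_click and click methods are hypothetical, invented for this example and not taken from any GUI library. The point is only that every button registers the same handler code, and each handler already knows its own label because the block closed over it.

```ruby
# Hypothetical Button class, invented for this sketch.
class Button
  def initialize
    @handlers = []
  end

  # Store the block as a Proc so it can be called later.
  def on_click(&handler)
    @handlers << handler
  end

  # Simulate a press by calling every registered handler.
  def click
    @handlers.each { |h| h.call }
  end
end

pressed = []
buttons = %w[ok cancel help].map do |label|
  button = Button.new
  # Same handler code for every button; each block closes over its
  # own label, so no extra state parameter is needed.
  button.on_click { pressed << label }
  button
end

buttons[1].click
buttons[0].click
p pressed  # => ["cancel", "ok"]
```

No state parameter and no setter is involved; the environment travels with the block.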

How do closures capture state?

The following code fills an array with closures that bind n, the parameter of a block passed to an iterator. There are three nested scopes, which I will refer to as outer, iteration block, and lambda.

  • the outer scope, in which closures is defined
  • the iteration block, in which n is defined
  • the lambda, which binds n

closures = []
(0..7).each { |n|
  closures << lambda { n }
}

# Each closure is bound to its own n.
n                            # => NameError: undefined local variable or method ...
closures.map { |c| c.call }  # => [0, 1, 2, 3, 4, 5, 6, 7]
n = 3
closures.map { |c| c.call }  # => [0, 1, 2, 3, 4, 5, 6, 7]

From the above it is apparent that each lambda holds its own n, since the values are distinct and unaffected by assigning to n in the outer scope. This happens because the iteration block's n goes out of scope after each pass, leaving the lambda as the only thing still bound to it. Each successive pass through the iteration block yields a newly scoped n that does not affect previously bound lambdas. Each closure has saved its own copy of state.
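One way to check that each closure really owns its state is to let the lambdas modify n. This is a small sketch along the lines of the example above; the names counters, a, b, c, and d are mine. Each lambda increments the n from its own pass through the block, so the counters advance independently.

```ruby
# Each pass through the block creates a fresh n; each lambda keeps its own.
counters = (0..2).map { |n| lambda { n += 1 } }

a = counters[0].call  # first counter started at 0, now 1
b = counters[0].call  # same counter again, now 2
c = counters[1].call  # second counter started at 1, now 2
d = counters[2].call  # third counter started at 2, now 3

p [a, b, c, d]  # => [1, 2, 2, 3]
```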

How does scope affect the closure's binding to n?

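The next piece of code is the same as the first with one change: n is defined in the outer scope. The lambdas behave very differently as a result. Note that this describes Ruby 1.8. From Ruby 1.9 on, block parameters are always local to the block, so there the block's n shadows the outer n and the closures behave as in the first example.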


# Here we introduce n in the outer scope
n = 0

closures = []
(0..7).each { |n|
  closures << lambda { n }
}

# Closures are bound to the outer n
n                            # => 7
closures.map { |c| c.call }  # => [7, 7, 7, 7, 7, 7, 7, 7]
n = 3
closures.map { |c| c.call }  # => [3, 3, 3, 3, 3, 3, 3, 3]

Now, since n was defined in the outer scope, each iteration only modified the value of the outer n. The closures are therefore all bound to the same n instead of each having its own, and changing n changes the n seen by every closure. The first example, where each lambda bound its own n, is analogous to instance variable state. The second example, where there is only one n, behaves more like a static class variable in some other languages.
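The shared, class-variable-like behavior can also be produced on purpose. In this sketch the lambdas refer to an outer variable directly from the lambda body, and the block parameter has a different name, so the result does not depend on how a given Ruby version treats a block parameter that shares a name with an outer variable. The names total, adders, step, and results are mine.

```ruby
total = 0

# Every lambda closes over the same outer total, but over its own step.
adders = (1..3).map { |step| lambda { total += step } }

results = adders.map { |a| a.call }  # running totals: 1, then 3, then 6

p results  # => [1, 3, 6]
p total    # => 6
```

Here total plays the role of the shared n, while step is per-closure state.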

Look out for the for loop!

Ruby's for loop does not have its own scope. It brings a variable into scope on the first pass and leaves it hanging around afterwards. The following code has only two scopes: outer and lambda. n is introduced into the outer scope on the first iteration and remains defined even after the loop terminates, despite being introduced as part of the iteration construct.


closures = []
for n in 0..7
  closures << lambda { n }
end
n                            # => 7
closures.map { |c| c.call }  # => [7, 7, 7, 7, 7, 7, 7, 7]
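For comparison, here is a side-by-side sketch of the two loop styles over a shorter range. The names with_for, with_each, and m are mine. The for loop leaves every lambda bound to the single outer n, while each gives every lambda its own block-local m.

```ruby
with_for = []
for n in 0..2
  with_for << lambda { n }   # all three lambdas share the one outer n
end

with_each = []
(0..2).each { |m| with_each << lambda { m } }  # a fresh m on every pass

p with_for.map { |c| c.call }   # => [2, 2, 2]
p with_each.map { |c| c.call }  # => [0, 1, 2]
p n                             # => 2, still defined after the loop
```

If the closures are meant to remember the value from each iteration, an iterator with a block is the safer choice.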

How to learn more

I am convinced that the only way to learn to program is to make something that doesn't work and then fix it. To understand closures it is necessary to understand the problem they solve. If you're like me and you read too much and program too little, it's no wonder many things seem out of reach. At least I still get a spiritual feeling of enlightenment and enthusiasm as a result of finally beginning to understand. This gospel compelled me to write, and so I have.

1 Response Follows

  1. peter says

    I am becoming obsessed with Ruby. I appreciated your efforts. I coded your examples 1 by 1 and tried slight variations, good stuff. I am also going through this file, this guy went all out: http://innig.net/software/ruby/closures-in-ruby.rb

