revisiting closures
Category javascript
I've occasionally mentioned the JavaScript concept of "closures", but recently it's come up in several conversations, so I wanted to revisit it in case any of you have been curious about what this is or why you should care.
Put simply, they're a complete departure from typical variable scope handling (unless you're used to Lisp or Perl). Consider, for example, the following LotusScript function:
Public Function getWidget (Byval widgetName As String) As Widget
Dim localSomething As String
Dim myWidget As New Widget(widgetName) 'Assume this class is defined elsewhere
Let localSomething = "this is a pointless variable"
Set getWidget = myWidget
End Function
Pretty straightforward: a new instance of some Widget class is constructed using the passed argument and returned from the function. A local String variable is declared and assigned a value, but it's completely unnecessary, because as soon as the function returns the new Widget object, any variables declared inside the function go out of scope and are discarded. So if, for example, Widget contains a method that refers to the localSomething variable, that method will look for the variable within the class definition and, failing that, among the global variables... if none exists, expect an error.
Java adds another layer of scoping called block scope:
public Widget getWidget (String widgetName) {
if (somethingIsTrue) { // assume a boolean defined elsewhere
Widget myWidget = new Widget(widgetName);
}
return myWidget; // oops... compile error: cannot find symbol
}
In this example, myWidget is actually local to the conditional (if) block: we can't refer to it outside of that block, because it's declared inside it. To refer to it after the block, we'd have to declare it before the block, whether or not we plan to do anything to it inside the block.
So what makes closures different? In JavaScript, a function defined inside another function keeps access to the outer function's local variables, so an object returned from a function can still use variables local to the function that returned it. "Whaaa?" Yes, the garbage collector knows not to destroy those variables while the object's methods can still reach them, because they might want to refer to them later. For example:
var Widget = function(widgetName) {
var localSomething = 'This may come in handy later.';
return {
displayLocal: function() {alert(localSomething);}
};
};
new Widget().displayLocal(); // alerts the value of localSomething
In this case, an object is created that contains a single "public" method: displayLocal. This method can safely refer to the localSomething variable, because it's bound to that variable via closure. In essence, this allows JavaScript objects to contain "private" variables and methods: if we had defined any functions within the above function - but outside of the returned object - they could be called by the returned object... but not by other code outside of the function that contains them.
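To make the "private method" point concrete, here is a minimal sketch. The names createLabel, pad, and render are hypothetical (they are not part of the example above), and the method returns a string instead of calling alert so the snippet can run outside a browser.

```javascript
// Hypothetical example: createLabel, pad, and render are illustrative names.
var createLabel = function(text) {
  var prefix = '>> ';            // private variable

  function pad(s) {              // private function: defined outside the returned object
    return prefix + s + ' <<';
  }

  return {
    render: function() {         // public method: reaches pad() and text via closure
      return pad(text);
    }
  };
};

var label = createLabel('hello');
console.log(label.render());     // the public method can call the private function
console.log(typeof label.pad);   // "undefined": pad is not a property of the returned object
```

Code outside createLabel has no name by which to reach pad or prefix; the only way in is through render.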
In other words, this allows JavaScript objects to be defined such that they behave more like class instances in other languages: they can contain public and private members. That's why, in the Widget example above, I "instantiate" a Widget via new... but it's worth mentioning that I didn't need to. Each time the Widget function is called, two new things are created: a local string and an object containing a public method that can refer to that string even after the function that created both has returned. Hence, whether I include the new keyword or not, a new object with the same behavior template is created, and anything defined locally and bound to it by closure is new as well: a separate instance of localSomething is bound to each object created by calling this function. I tend to use new when calling functions that are structured this way to indicate (to myself and others maintaining my code) that the returned object behaves like a class instance, but the behavior does not depend on that convention.
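A small sketch of that claim, using a hypothetical variation on the Widget function: the count variable and the increment and describe methods are my own additions for illustration, not part of the original example.

```javascript
// Hypothetical variation on Widget: count, increment, and describe are illustrative.
var Widget = function(widgetName) {
  var count = 0;                       // a fresh variable for every call
  return {
    increment: function() { count += 1; return count; },
    describe: function() { return widgetName + ': ' + count; }
  };
};

var a = new Widget('a');               // called with new
var b = Widget('b');                   // called without new

a.increment();
a.increment();
b.increment();

console.log(a.describe());             // "a: 2"
console.log(b.describe());             // "b: 1"
console.log(a instanceof Widget);      // false: new handed back the returned object literal
```

Each call gets its own count, and because the function explicitly returns an object, new simply yields that object, which is why the two calling styles behave the same here.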
One final syntactical nugget before discussing implications: tacking parentheses onto the example function's definition makes it behave something like a singleton:
var Widget = function(){
....
}();
Widget.displayLocal(); // we no longer need to call the function each time
In this case, the function is called immediately, so the variable Widget is assigned a reference to the object that is returned from the function, not to the function itself. Widget IS the object. It has public methods bound to private data. For quite a while, I've preferred this approach to defining global singletons, because you don't have to call a function each time... you already have the object. But I'm finding that this tends to confuse people: they'll try to instantiate the object again, which causes the code to break, because Widget isn't a function, so calling it (with or without new) throws a TypeError. In the end, of course, it's always best to keep global variables to a minimum; Yahoo recommends creating a single global object that represents the entire application, and nesting objects appropriately within that space, and that's what I tend to do. This seems to mitigate the confusion, as most developers then grok that there's no need to construct new copies of the object; just "use" the object as constructed in-place, and customize it as needed.
So with all of that out of the way, why are closures actually useful? For the same reasons private members of a LotusScript class are useful:
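Here is a rough sketch of both points: the immediately-called function nested under one application-wide object, and what happens when someone tries to instantiate the result. APP and APP.widget are hypothetical names chosen for illustration, and displayLocal returns its string rather than alerting it so the snippet runs outside a browser.

```javascript
// Hypothetical names: APP and APP.widget are assumptions made for this sketch.
var APP = {};                          // the one global for the whole application

APP.widget = function() {
  var localSomething = 'This may come in handy later.';
  return {
    displayLocal: function() { return localSomething; }
  };
}();                                   // called immediately: APP.widget is the object

console.log(typeof APP.widget);        // "object", not "function"
console.log(APP.widget.displayLocal());

var outcome;
try {
  new APP.widget();                    // treating the object like a constructor
  outcome = 'no error';
} catch (e) {
  outcome = e instanceof TypeError ? 'TypeError' : 'other error';
}
console.log(outcome);                  // "TypeError"
```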
- Internal storage of variables that cannot be mucked with from outside the object
- Minimizing global variables by associating groups of related data with a container object
- Probably most importantly, manageable code through proper decomposition without convolution of the public API
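As a closing illustration of the first and last points, here is a minimal sketch, assuming a made-up capped counter. createCounter, clamp, add, and getTotal are all hypothetical names.

```javascript
// Hypothetical example: createCounter, clamp, add, and getTotal are illustrative names.
var createCounter = function(limit) {
  var total = 0;                       // internal storage, reachable only via closure

  function clamp(n) {                  // private helper: keeps the public API small
    return n > limit ? limit : n;
  }

  return {
    add: function(n) { total = clamp(total + n); return total; },
    getTotal: function() { return total; }
  };
};

var counter = createCounter(10);
counter.add(4);
counter.total = 999;                   // only creates an unrelated public property
console.log(counter.getTotal());       // 4: the private total is untouched

counter.add(100);
console.log(counter.getTotal());       // 10: clamp() enforced the limit
```

The public surface is just add and getTotal; the clamping logic can be reorganized freely without any caller noticing.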
