2010-07-14 Lambda variable scope & execution confusion

Published on July 14, 2010

I've been conducting interviews over the last couple of weeks and have noticed a general lack of understanding of lambda expressions, even among candidates with senior titles. So here's some information about lambdas that, if you don't already know it, you should!

First of all, lambda expressions are just inline methods. Unlike an ordinary named method that you wrap in a delegate, a lambda has access to the variables that are in scope where the lambda is declared. So if you have a method with a variable v and a lambda l, the statements inside l will have access to v, as in the following example:

void SomeMethod() {
    int v = 1;
    Func<int> l = () => { return v + 1; };
    ...
}

Simple enough, right? Except, and here's what I believe separates the good from the great, do you know how that works? Think about it: in that example, l is a method and it doesn't take v as a parameter, so v should be out of its scope, just as it would be for a separate named method. But we already said it isn't. The reason is a cute compiler trick that pulls the necessary variables out of the method and tacks them onto a class of their own. This is called variable lifting (you will also see the variables described as captured, and the lambda as a closure). The lambda expression is tacked onto the same class as a method, so it has access to these variables, which would otherwise be out of scope.
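To make the trick concrete, here is a rough hand-written approximation of the rewrite. It's a sketch, not the compiler's actual output: LiftedVariables, LiftingSketch, WithLambda and ByHand are names I made up for illustration, and the real generated class gets a compiler-chosen name you can't type in source code.

```csharp
using System;

// Hypothetical stand-in for the class the compiler generates.
class LiftedVariables
{
    public int v;                          // the lifted local variable

    public int Lambda() { return v + 1; }  // the body of the lambda l
}

static class LiftingSketch
{
    // What you write.
    public static int WithLambda()
    {
        int v = 1;
        Func<int> l = () => { return v + 1; };
        return l();
    }

    // Roughly what it gets rewritten into.
    public static int ByHand()
    {
        var locals = new LiftedVariables();
        locals.v = 1;                 // every use of v now goes through the object
        Func<int> l = locals.Lambda;  // the delegate points at an instance method
        return l();
    }

    static void Main()
    {
        Console.WriteLine(WithLambda()); // 2
        Console.WriteLine(ByHand());     // 2
    }
}
```

The point to take away is that v stops being a plain local: both the method and the lambda reach it through the same object.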

How it works matters because of the side effects. Now that you know, you should be able to figure out what the following snippet prints:

int i = 1;
Func<int> f = () => { return i; };
i = 10;
Console.WriteLine(f());

Did you get it? The output is 10, because by the time the Func is called, i has been set to 10. Since i is lifted into its own class, the assignment on the line after f is initialized changes the very same i that f reads. So by the time we execute f, i is 10.
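The sharing goes in both directions, which the following sketch tries to show. The names (SharedVariableSketch, Run, count, increment, read) are mine and purely illustrative: one lambda writes the lifted variable, another reads it, and the enclosing method does both.

```csharp
using System;

static class SharedVariableSketch
{
    // Returns what the reading lambda saw at two points in time.
    public static int[] Run()
    {
        int count = 0;
        Action increment = () => { count++; };     // writes the lifted variable
        Func<int> read = () => { return count; };  // reads the same variable

        increment();
        increment();
        int afterIncrements = read();  // the lambda's writes are visible: 2

        count = 10;                    // the method writes the same variable
        int afterAssignment = read();  // the lambda sees the method's write: 10

        return new[] { afterIncrements, afterAssignment };
    }

    static void Main()
    {
        int[] seen = Run();
        Console.WriteLine(seen[0]); // 2
        Console.WriteLine(seen[1]); // 10
    }
}
```

Nothing is copied when the lambdas are created; there is only ever one count.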

Consider the difference this makes for the following code:

string[] strings = {"this is a sentence", "here's another one", "this should be plenty of sentences", "we're probably done here" };
var query = strings.AsQueryable();
foreach(var s in new string[] {"s", "p"}) {
    query = query.Where(x => x.Contains(s));
}
foreach (var item in query) {
    Console.WriteLine(item);
}

Most developers I've spoken to assume that the query will have two predicates (contains “s” and contains “p”), so the output will be:

this should be plenty of sentences

but in fact, the output is:

this should be plenty of sentences
we're probably done here

because the query that's built will basically look like:

query.Where(x => x.Contains("p")).Where(x => x.Contains("p"))

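since the s variable is lifted and changes on each iteration, so every predicate reads the same variable. By the time you get around to executing the query, s is set to the last string in the array. (That is how foreach behaved in the compilers available at the time of writing; C# 5 later changed foreach to give each iteration a fresh variable, though for loops still share one.) The way around this is to create a placeholder variable inside the foreach loop like this: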

foreach(var s in new string[] {"s", "p"}) {
    var localString = s;
    query = query.Where(x => x.Contains(localString));
}
...
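For anyone who wants to run the two variants side by side, here is a self-contained sketch. LoopCaptureSketch, SharedVariable and FreshVariable are names I've invented for it. One assumption to flag: the shared-variable version declares s outside the loop on purpose, so that it shows the single-variable behaviour regardless of how a given compiler scopes the foreach variable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LoopCaptureSketch
{
    static readonly string[] Strings =
    {
        "this is a sentence", "here's another one",
        "this should be plenty of sentences", "we're probably done here"
    };

    // One variable shared by every iteration: both predicates end up testing "p".
    public static List<string> SharedVariable()
    {
        var query = Strings.AsQueryable();
        string s = null; // declared outside the loop, so there is only one s
        foreach (var letter in new[] { "s", "p" })
        {
            s = letter;
            query = query.Where(x => x.Contains(s));
        }
        return query.ToList(); // the query runs here, when s is already "p"
    }

    // A fresh variable per iteration: one predicate tests "s", the other "p".
    public static List<string> FreshVariable()
    {
        var query = Strings.AsQueryable();
        foreach (var s in new[] { "s", "p" })
        {
            var localString = s; // a new variable is lifted on every pass
            query = query.Where(x => x.Contains(localString));
        }
        return query.ToList();
    }

    static void Main()
    {
        Console.WriteLine(SharedVariable().Count); // 2
        Console.WriteLine(FreshVariable().Count);  // 1
    }
}
```

SharedVariable returns the two sentences containing "p", while FreshVariable returns only "this should be plenty of sentences".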