Anonymous Functions and Overwriting Arguments

I’ve stumbled upon the following (in my mind) weird behavior:

If you define a named function that returns a function f1 based on a parameter a, the returned function stays the same even if a is assigned another value later on.
However, if you define the same function directly as an anonymous function depending on a, the function’s result changes when the value of a changes.

Illustrative example:

function fun(a)    
    x -> x + a
end

a = 2
f1 = fun(a)
f2 = x -> x + a

@show f1(0)  # returns 2
@show f2(0)  # returns 2

a = 3

@show f1(0)  # still returns 2
@show f2(0)  # now returns 3 !!!

Why do those variants work differently?

They work differently because calling fun(a) binds the current value of a to fun’s own local argument, and f1 captures that local variable: it stores a = 2 as a field. Meanwhile, f2 refers to the global variable Main.a and looks it up every time it is called. This can be seen using the following introspection tools:

julia> @code_lowered f1(0)
CodeInfo(
1 ─ %1 = Main.:+
│   %2 = Core.getfield(#self#, :a)
│   %3 = (%1)(x, %2)
└──      return %3
)

julia> @code_lowered f2(0)
CodeInfo(
1 ─ %1 = Main.:+
│   %2 = (%1)(x, Main.a)
└──      return %2
)

julia> dump(f1)
#13 (function of type var"#13#14"{Int64})
  a: Int64 2

julia> dump(f2)
#15 (function of type var"#15#16")

Thanks! That clears up a large part of my confusion.

But what is the reason behind this design decision in Julia?

Passing an argument acts just like an assignment, in Julia and most other programming languages. It’s no different from:

julia> a = 3
3

julia> b = a
3

julia> a = 4
4

julia> b          # still == 3!
3

This is a common point of confusion for new programmers. See the section on assignment vs. mutation in the manual, or blog posts like The Map Is Not the Territory.

The difference arises because f2 accesses the global variable a, which you later reassign, whereas f1 uses fun’s local argument a from the call that created it. The local argument a is completely separate from the global a once the fun call has assigned the input values to its arguments; you’d get the same behavior if you renamed it, as in fun(b) = x->x+b. The local argument a is also different across different calls, so a second f3 = fun(3) creates a different local a to be captured by a different closure. The separation between temporary local and persistent global variables is present in most programming languages, and in those with closures, a closure over a local variable will likewise behave differently from a function that only accesses a global variable.
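A minimal sketch of that point, assuming a renamed factory (make_adder and its argument b are hypothetical names, not from the original post). Each call creates its own local binding, and neither closure is affected by the global a:

```julia
# Hypothetical factory; the argument is renamed to b to show that
# the name of the local argument does not matter.
make_adder(b) = x -> x + b

a = 2
f1 = make_adder(a)   # captures a local b bound to the value 2
f3 = make_adder(3)   # a separate call with its own local b bound to 3

a = 10               # rebinding the global a touches neither closure

@show f1(0)          # f1(0) = 2
@show f3(0)          # f3(0) = 3
```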

You can in fact reassign the local variable a captured by closures, whether it’s inside or from outside the associated fun call.

julia> function fun(a)
           a = zero(a)
           (y) -> (a = y), x -> x + a
       end
fun (generic function with 1 method)

julia> reassign, add = fun(98592987)
(var"#1#3"(Core.Box(0)), var"#2#4"(Core.Box(0)))

julia> add(8), add(15)
(8, 15)

julia> reassign(100); add(8), add(15)
(108, 115)

Note that reassigning a captured variable adds overhead, often more than necessary, because the lowerer and compiler can’t yet coordinate on type inference (hence the Core.Box). If mutation is needed, people therefore tend to capture a typed RefValue and mutate its contents instead.
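As a rough sketch of the RefValue approach (fun_ref and r are hypothetical names, and restricting the argument to Int is an assumption made only to keep the example concrete), the captured variable is never reassigned; only the contents of the Ref change:

```julia
# Hypothetical variant of fun that captures a typed Ref instead of
# reassigning the captured variable itself.
function fun_ref(a::Int)
    r = Ref(zero(a))            # a Base.RefValue{Int}; r itself is never rebound
    reassign = y -> (r[] = y)   # mutates the contents of r
    add = x -> x + r[]          # reads the current contents of r
    return reassign, add
end

reassign, add = fun_ref(98592987)
before = (add(8), add(15))      # (8, 15)
reassign(100)
after = (add(8), add(15))       # (108, 115)
println(before, " ", after)
```

Because r is assigned exactly once, the closures can store it as a concretely typed field rather than a Core.Box.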

Well, it’s more like let a = 3 in that it introduces a new binding, i.e., within the function the argument is bound to the passed value no matter whether a variable of that name exists already or not. Assignment is somewhat more complicated in that it sometimes introduces a new variable and sometimes not (soft/hard scope).

@phK3 The behaviour in your example is nowadays standard in most programming languages and called lexical scope:

function fun(a)
    # Inside the function a new binding for a is established
    x -> x + a  # The free a in this function refers to that binding, i.e., it is a so-called lexical closure
end

# Same example with let
f3 = let a = 2
    x -> x + a  # refers to the local a here!
end

a = 3  # global a, cannot change the a introduced by let above
f3(0)  # returns 2

The alternative (which you expected for f1 as well?) is called dynamic scope and is still available in some programming languages, most prominently Common Lisp. For dynamic scoping in Julia, check ScopedValues (which cannot be bound via function arguments though, i.e., only the let example would work for them).
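A small sketch of what dynamic scoping with ScopedValues could look like, assuming Julia 1.11 or newer, where Base.ScopedValues is available (offset and f4 are hypothetical names used only for illustration):

```julia
using Base.ScopedValues   # assumption: Julia 1.11 or newer

# Hypothetical scoped value with a default of 2.
const offset = ScopedValue(2)

# f4 reads whatever offset is bound to at the time of the call.
f4 = x -> x + offset[]

outside = f4(0)              # uses the default value
inside = with(offset => 3) do
    f4(0)                    # sees the value bound for this dynamic scope
end
outside_again = f4(0)        # the binding ends when the with block exits

println((outside, inside, outside_again))   # (2, 3, 2)
```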


I’ll try to offer a more elementary explanation here.

When I first learned programming eons ago, in a language called Simula, the lecturer described a variable as a label engraved on a box into which one could put stuff. Like

a = 3
f(a)
a = 4

And it would always be the same box, i.e. located at the same place in memory. So the 4 replaces the 3 in the box. While this was approximately correct for Simula, and for some other languages, it’s not a very useful way to think of variables in Julia.

It’s somewhat better to think of variables as labels which you move around. So, first you put the label a on the value 3. Then you give this value to the function f (not the label; the value gets a brand-new label inside the function, namely the formal argument). After f returns, you reuse the a-label for the value 4. That is, you don’t store the value 4 where previously there was a 3.

The same goes for more complicated values, like vectors. You can reuse the label again:

a = [1, 2, 3, 4]
b = a

Now b is another label on the value of a, that is, on the same vector.
Since a labels a vector, and vectors are “mutable”, i.e. you can change their content, you can update the content of that vector with e.g.

a[2] = 42

and b[2] will now be 42, since b is just another label for the same vector. You can of course reuse b as well:

b = [2,3,4,5]

and it is now unrelated to a. You have reused the label b for a new vector. So b[2] is now 3, whereas a[2] is still 42.
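The label picture can be checked with the identity operator ===, which tests whether two names refer to the very same object. A short sketch putting the steps above together:

```julia
a = [1, 2, 3, 4]
b = a                    # a second label on the same vector
same_before = a === b    # true: one vector, two labels

a[2] = 42                # mutates the vector that both labels point to
seen_through_b = b[2]    # 42

b = [2, 3, 4, 5]         # the label b moves to a new vector
same_after = a === b     # false: the labels now sit on different vectors

println((same_before, seen_through_b, same_after, a[2], b[2]))
```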

In your original example, you used the global a inside your f2 function. No other a is visible inside f2, so if you reuse a after creating f2, it is still the same label: the one used inside f2 is the global one, which is called Main.a in gdalle’s lowering magic above.

On the other hand, the f1 function refers to the a inside fun, which is a new “label” for the value you passed to fun, i.e. 2 (every time you call fun, you get a brand-new a in there, unrelated to the a from previous calls). That “label” is not visible outside fun (well … not entirely true, you can actually access it as f1.a, but that’s not documented), so it’s not affected by what you do outside fun.

Your example has the same effect as

function fun(b)    
    x -> x + b
end

a = 2
f1 = fun(2)
f2 = x -> x + a

@show f1(0)  # returns 2
@show f2(0)  # returns 2

a = 3

@show f1(0)  # still returns 2
@show f2(0)  # now returns 3 !!!

which may be easier to understand.