Referencing local variable before assignment results in unexpected behavior

Julia version: 1.6.0
I recently had a bug in my code. After isolating it, it came down to a piece of code that looks like this:

function outer(n)
    function inner()
        n = n + 1 
    end

    inner()
    return n
end
        
println(outer(3))
4

My impression was that outer(3) should return 3, but it actually returns 4. Somehow the code in the inner() block assigns value 4 to the variable n outside of its scope.

To me this is very counter-intuitive. Is this a bug or intended behavior?

I tried a similar piece of code in Python:

def outer(n):
    def inner():
        n = n + 1
    
    inner()
    return n

print(outer(3))

The Python code results in an error:

UnboundLocalError: local variable 'n' referenced before assignment

It seems to me that Julia should throw an error like this as well. If the current behavior is correct and expected, I’d appreciate any insight as to the reason for this design.


After exploring this further, I found that Julia behaves completely differently from Python when handling inner function definitions.
In Julia:

function outer()
    n1 = 1
    function inner()
        n1 = 3
    end

    inner()
    return n1
end
        
println(outer())

Results in:

3

But in Python:

def outer():
    n1 = 1
    def inner():
        n1 = 3
    
    inner()
    return n1

print(outer())

results in:

1

So this confirms the previous observation that inner functions in Julia can actually assign values to variables outside of their own scope. Is this intended behavior?


This is julia closures working as intended (imo very unfortunately – I’d love to see this change in julia 2.0): Not just in-place modifications, also reassignments are visible through closures. Your code gets transformed into something like

struct Outer_inner_closure <: Function
    n1::Ref{Any}
end
function (instance::Outer_inner_closure)()
    instance.n1[] += 1
end
function outer()
    n1 = Ref{Any}()
    n1[] = 1
    inner = Outer_inner_closure(n1)
    inner()
    return n1[]
end

This is almost never what you want; and if you want it, then you should explicitly allocate a Ref (because the explicit Ref can get proper typing! And if you don’t want rigid typing you can make it a Ref{Any}.)
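To make the explicit-Ref suggestion concrete, here is a minimal sketch of the question's outer(n) rewritten that way. The names outer_with_ref and counter are hypothetical, and restricting n to Int is an assumption made only so the Ref is concretely typed.

```julia
# Hypothetical rewrite of the question's outer(n) with an explicit, typed Ref.
function outer_with_ref(n::Int)
    counter = Ref(n)              # Ref{Int}: a concretely typed shared cell
    inner() = (counter[] += 1)    # mutates the cell; `counter` itself is never reassigned
    inner()
    return counter[]
end

println(outer_with_ref(3))  # prints 4
```

The sharing is now visible at every use site through the [] syntax, and the closure only captures a binding that is assigned once.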

TLDR: Never ever assign to a closed over variable in julia. It makes code hard to read, and the compiler cannot optimize it, meaning that performance goes to hell.

edit: As evidenced by the way closures work, this was a conscious decision by the design team. My tip against reassignment from closures is an idiosyncratic style preference I stand by, but it is not an official guidance or something. Cf below discussion.


For me, more than half the utility of closures is exactly the fact that they capture bindings this way. Without this feature, closures just help you avoid polluting the namespace and avoid explicitly stating a long list of parameters (if many bindings of the outer scope are touched). Obviously, you can also simulate the binding capture with the Ref trick pointed out above by @foobar_lv2, but this is done only because the compiler does not play nice with typing in closures; if that worked well, capturing bindings would be a much nicer syntax than needing to use [] everywhere.

If you wanted it to work like Python you would need:

julia> function outer(n)
           function inner()
               let n = n
                   n = n + 1
               end
           end
       
           inner()
           return n
       end
outer (generic function with 1 method)

julia> outer(5)
5

Thanks for the response!

Is this not contrary to what is stated in the documentation:

A new local scope is introduced by most code blocks (see above table for a complete list). Some programming languages require explicitly declaring new variables before using them. Explicit declaration works in Julia too: in any local scope, writing local x declares a new local variable in that scope, regardless of whether there is already a variable named x in an outer scope or not. Declaring each new local like this is somewhat verbose and tedious, however, so Julia, like many other languages, considers assignment to a new variable in a local scope to implicitly declare that variable as a new local.

https://docs.julialang.org/en/v1/manual/variables-and-scoping/


It seems that explicitly declaring the variable as local results in expected behavior:

function outer()
    n1 = 1
    function inner()
        local n1 = 3
    end

    inner()
    return n1
end
        
println(outer())

Results in:

1

But that’s not a trick! Outside of closures, julia has very simple easy-to-understand semantics: Assignments are not an operation, they are a syntactic construct that gives a nice name to a computed value. Values are SSA, they do not have addresses and are immutable. They can point to an allocated object that can be modified via setfield!/ setindex! and the like.

Languages like C allow the addressof (&) operator. This is completely bonkers insane – it means that whether something is addressable/stack or register/SSA depends nonlocally on whether the address is taken later at some point. With C++ this gets even more unreadable, because the callee can accept a reference and write to it, and there is no syntactic indication of this at the callsite. (in a saner syntax, one would simply pass a pointer instead of pretending to pass a value).

The way julia closure semantics work, something that looks like a value is instead a Ref (ok, a Core.Box, but that’s pretty equivalent to a Ref), and lowering magically inserts all the x[] (like the magical hidden * added in C because the thing you declared as int actually is an int* because you took its address somewhere). This hurts readability, and is imo much harder to understand for beginners.
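The hidden box can be observed directly. The sketch below relies on implementation details of current Julia 1.x releases (captured variables stored as fields of the closure object, and Core.Box having a contents field); these are not a documented API and may change. The names make_counter, total, and increment are hypothetical.

```julia
function make_counter()
    total = 0
    increment() = (total += 1)   # reassigns a captured local, so it gets boxed
    return increment
end

inc = make_counter()
inc(); inc()

println(fieldnames(typeof(inc)))   # the captured variable shows up as a field
println(typeof(inc.total))         # Core.Box on current 1.x releases
println(inc.total.contents)        # the value currently stored in the box
```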

If you want something mutable across stackframes, then allocate it explicitly, instead of hiding it. I’m not arguing for better optimizations (ok, these are always nice), I’m arguing for capture-by-value closure semantics (hence julia 2.0 at the earliest).


Note that in your original post the assignment is n = n + 1. The n in n + 1 cannot be a new binding that has not even been assigned for the first time yet, so the compiler assumes you are referring to the n of the outer scope. This, for example, will fail:

julia> function outer(n)
           function inner()
               local n = n + 1
           end
       
           inner()
           return n
       end
outer (generic function with 1 method)

julia> outer(5)
ERROR: UndefVarError: n not defined
Stacktrace:
 [1] (::var"#inner#2")() at ./REPL[3]:3
 [2] outer(::Int64) at ./REPL[3]:6
 [3] top-level scope at REPL[4]:1

Because now the compiler is certain that you want a binding n that is local to the inner function, but then, the n in n + 1 is not yet defined.

If local is used, then the behavior matches Python:

function outer(n)
    function inner()
        local n = n + 1 
    end

    inner()
    return n
end
        
println(outer(3))

Results in:

ERROR: LoadError: UndefVarError: n not defined

Which is the same behavior as Python. So the code failing does make sense.

Assignments may be operations in other circumstances, for example: every assignment to a global variable.

…
This is a completely valid and legitimate choice in the design of a language. I doubt the Julia founders think Ritchie was bonkers.

Most people I know consider references a legibility improvement over pointers. Also, this is exactly how Julia behaves for mutable objects: when you pass a mutable object to a Julia function, you cannot be sure that it will not be mutated.

I do not really mind what the semantics are, as I do not use closures a lot (at least explicitly). But semantics are one thing and optimization is another: you could have the current semantics (inner functions assigning to bindings of the outer function), and the compiler could have a way to deal with them that does not need to allocate anything. One does not necessarily imply the other.

I’m only half tongue-in-cheek here.

Julia doesn’t need addressof: The type of an object (mutable vs immutable) describes whether it behaves like mutable shared data or like a value. In the relatively rare cases that one doesn’t like that, one can always wrap a mutable struct (like e.g. a Ref) around the immutable value. This is rare enough that being explicit is affordable, whereas C would become too verbose if the declaration and all uses already contained info about the possible max level of indirections. So supporting the totally-bonkers addressof construction is almost surely the right trade-off for C.

Likewise, capture-by-value would probably have been affordable for julia: One would gain a lot of explicitness, transparency, and language simplicity / elegance, at a very modest price in verbosity. This is especially true in today’s world, where the compiler is so bad at optimizing closures: High quality code is already effectively capture-by-value, simply because everything else is slow. Other languages like scala could not make this decision: In scala, there are closures mutating hidden shared state everywhere, and stackframe boundaries are very implicit (e.g. for-loops are closures in scala, and return doesn’t return from the current stackframe, it returns from the current lexical function; if you are in an anon-closure, then this is implemented via exception handler).


I feel that since this is marked as “the solution” to this thread I should comment that “never ever assign to a closed over variable” is very much not an official stance. It is totally fine to do so, although, yes, it can be a performance issue, so beware when you are writing performance critical code. Not all code is performance critical, however, and being able to modify variables that are closed over is the defining characteristic of a closure. Julia’s behavior here is the standard behavior that Lisp and its descendants have. We’re not doing something weird or questionable here. In fact, Python’s refusal to support closures has long been a big point of contention for people who wanted to do functional programming.

The main reason closures work the way they do is so that you can simulate control flow features like for loops and conditionals. Consider the foreach function. Here’s a for loop that accumulates a total, demonstrated with v = 3:7:

julia> let t = 0
           for x in v
               t += x
           end
           t
       end
25

Closures are expressly designed so that this works the same way:

julia> let t = 0
           foreach(v) do x
               t += x
           end
           t
       end
25

The design of closures is — not accidentally — exactly what’s needed to make these work the same for any loop body. If inner functions captured by value and didn’t allow modifying outer locals, then you couldn’t do this: at the end of the let block, t would still be zero.
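To see the difference side by side, the sketch below contrasts today's capture-by-binding with an emulation of capture-by-value, using the let t = t idiom shown earlier in the thread to give the closure body its own copy. The function names sum_by_binding and sum_by_value are hypothetical.

```julia
# Current semantics: the do-block body updates the outer local `t`.
function sum_by_binding(v)
    t = 0
    foreach(v) do x
        t += x
    end
    return t
end

# Emulated capture-by-value: `let t = t` creates a fresh local copy,
# so the outer `t` is never updated.
function sum_by_value(v)
    t = 0
    foreach(v) do x
        let t = t
            t += x
        end
    end
    return t
end

println(sum_by_binding(3:7))  # prints 25
println(sum_by_value(3:7))    # prints 0
```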

There is a design discussion to be had about whether perhaps inner functions should capture by value instead of closing over bindings, but that’s far from standard — in fact, I can’t think of any language that does this and there’s no name for it — it’s definitely not a closure anymore. And as you can see, it means that you can no longer easily do control-flow-like things with anonymous functions. Writing these posts as though everyone agrees that’s the right thing to do is quite misleading. It may be a good idea, but it’s a very unexplored design space.


Purely functional programming languages (e.g., Haskell) effectively have capture-by-value “closures”, right? Or rather, you can’t tell the difference since there’s no mutation/rebinding.

And I think these languages are an important reference point. Purity is very important for optimization. That is to say, foldl(+, v; init = 0) is much more compiler friendly than let t = 0; foreach(x -> (t += x), v); end.
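As a small illustration of the fold style, the accumulator can be threaded through as an argument instead of being a captured variable. The sum-of-squares reduction and the name sum_of_squares are chosen here only as an example.

```julia
v = 3:7

# The accumulator `acc` is passed in and returned; nothing is rebound
# in an enclosing scope, so no captured variable needs to be boxed.
sum_of_squares = foldl((acc, x) -> acc + x^2, v; init = 0)

println(foldl(+, v; init = 0))  # prints 25
println(sum_of_squares)         # prints 135
```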

What about the continuation-passing style (CPS)? It lets you have control flow without requiring rebinding variables.

Also, I think FLoops.jl is one of the concrete examples in Julia that shows that you don’t need to be able to update outer scoped variables for using inner functions as a construct of a control-flow structure (but yeah, it requires some mechanism and I agree it’s not “easy”). Transducers.jl is another one since transducers are CPS.


Thank you for this reply! However, as I mentioned previously in this thread, I believe this is counter to what the documentation describes.

I opened an issue on the Julia github page to highlight this:
https://github.com/JuliaLang/julia/issues/40238

I don’t see what is wrong about that paragraph, but we can continue that discussion on the GitHub issue.


If I may try to rephrase your points, they seem to be that (a) reassigning variables is unnecessary for writing programs and (b) it is maybe better to avoid. Is that fair? The former is unarguably true: it is possible to write all programs without updating any variables. I largely agree that this is often a good way to write programs. However, I don’t feel that it’s a good idea for a language like Julia to go so far as to forbid updating assignment. It’s just not the kind of language where we forbid that kind of thing and insist on more esoteric programming patterns for the sake of ideological purity. Moreover, you can already write programs this way in Julia, and if you do, then you won’t care whether inner functions capture variables by binding or value since, as you say, there’s no observable difference (and this will always be fast).

Neither of these points contradicts what I was saying, which is that in the presence of the ability to update local variables, the current closure behavior is the natural choice that allows closures to emulate control flow. The existence of purely functional languages, in particular, does not seem like evidence that it’s a good idea for inner functions to capture locals by value in impure languages, since in pure languages there’s no way to tell the difference. What would be evidence for that being a good design is a language where you can rebind variables and inner functions capture them by value rather than binding. A pure language can’t argue either way since there’s no difference.


I like the idea of explicitly using Ref for capture-by-reference closures. Having them implicitly is a common source of type instability, and closure let blocks are the only hacky looking code I have to use regularly in Julia.
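For readers who have not seen the let-block idiom mentioned here, a minimal sketch follows. The function make_scaled_adder is hypothetical. The assumption, based on the behavior discussed in this thread, is that a captured variable that is assigned more than once may end up boxed, while the fresh binding introduced by let is assigned exactly once.

```julia
function make_scaled_adder(x)
    x = 2x                 # second assignment to `x` (the argument was the first)
    add = let x = x        # fresh, singly assigned binding just for the closure
        y -> x + y
    end
    return add
end

add = make_scaled_adder(1)
println(add(3))  # prints 5
```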

Yes, I think your reasoning is fair and does make sense (consistent). I think I know where the decision is coming from. It is a beautiful property and very useful that, e.g., open(...)-try-finally-close can be re-written as open(...) do.

But I cannot help wondering if capture-by-value inner functions could have been one of those design decisions where Julia chose to step away from the conventional wisdom. It’s not as if the design choices in Julia were all standard (if they were, it would have been a very boring language). Could it be a design choice that “does not make sense” at face value (especially in an otherwise imperative language, as you said) but that actually works well in the context in which Julia is used?

When discussing this, I think analyzing the functional-language flavor of Julia is useful. For example, iterate is pure, and it is in a way surprising that Julia doesn’t use a more conventional mutation-based “iterator” approach. The decision to have a functional iterate makes sense given how dynamism and JIT compilation interact; i.e., since type inference is not a language API, the user cannot pre-compute the mutable state type that some generic iterators (combinators) require. I think this mirrors the problem we have with closures. Proper closures would have worked much better if Julia were a statically typed language or a “scripting” language without strict performance requirements. However, Julia is a dynamically typed language and its users love to write highly optimized programs. I think this is the source of a unique challenge for the semantics of inner functions/closures in Julia. I find it interesting that a purely functional approach is a (potential) solution for two independent problems that both stem from Julia’s dynamic typing. (Edit: Ok, this could be just because I have a hammer (functional programming) and everything looks like a nail.)
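For reference, this is roughly what the functional iterate protocol looks like when written out by hand: the iteration state is returned and passed back in rather than mutated inside an iterator object. The function name sum_with_iterate is hypothetical.

```julia
function sum_with_iterate(itr)
    total = 0
    next = iterate(itr)                # returns (item, state) or nothing
    while next !== nothing
        item, state = next
        total += item
        next = iterate(itr, state)     # state is threaded through explicitly
    end
    return total
end

println(sum_with_iterate(3:7))  # prints 25
```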

Of course, it’s impossible to change how default closure works now in 1.x. Maybe opaque closure can eliminate performance gotchas (which would be great). But I feel an optional capture-by-value inner function would be an interesting API to explore.


That’s always the question, isn’t it? It’s true that we made a lot of unconventional choices in the language design, but I also think there’s a limited budget of those and you’ve got to really have conviction behind them. So Julia’s closure design is the bog standard Lisp behavior. It’s funny that so many people think we did something weird with this or scope when, in fact, we did the most conventional thing possible with both, following the Lisp tradition to a tee. Python is, in fact, the unconventional language here in that it strongly rejected closures and fine-grained scopes. Of course, now so many people have Python experience and come from that context. But historically Lisp, and Scheme in particular, is the gold standard dynamic language for this kind of thing. So what do you do when you don’t have a strong take on something like this? You follow the gold standard.

Of course, it’s impossible to change how default closure works now in 1.x. Maybe opaque closure can eliminate performance gotchas (which would be great). But I feel an optional capture-by-value inner function would be an interesting API to explore.

I was previously more excited about this, but I sat down and thought it through one day and convinced myself that it’s a bad idea for the default closure behavior with reasoning that I’ve pretty much already put down here. The reasoning is this:

  1. Let’s assume that we want let s = 0; for x in v; s += x; end; s and let s = 0; foreach(v) do x; s += x; end; s to work the same way.
  2. Let’s assume that we made capture-by-value the way inner functions work.

Those two assumptions dictate that the for loop version leaves s == 0 at the end, which I consider to be a reductio ad absurdum: we can’t have that, so we have to pick one of the assumptions to violate. If we’re going to stick with the second assumption, we have to violate the first one. But that’s a really bitter pill to swallow because we wrap code in closures in Julia all the time. It is so common to throw an open(file) do block around some code; it’s a key idiom in the language. With this change, wrapping something in a closure would break things all the time. We implicitly rely, all the time, on the fact that code works the same way whether or not it’s wrapped in a closure. To me that’s a QED that we can’t really have inner functions capture by value, at least not by default.
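A small sketch of the idiom being described: a counting loop is wrapped in open(path) do, and the assignment to the outer local n keeps working unchanged. The function count_lines and the temporary file contents are hypothetical, chosen only for illustration.

```julia
function count_lines(path)
    n = 0
    open(path) do io            # the loop body now lives inside a closure
        for line in eachline(io)
            n += 1              # still updates the outer local `n`
        end
    end
    return n
end

path, io = mktemp()
write(io, "a\nb\nc\n")
close(io)
println(count_lines(path))  # prints 3
rm(path)
```

Under capture-by-value semantics, the same function would return 0 or fail, depending on how the assignment inside the closure was scoped.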

I still think it may be possible to further optimize https://github.com/JuliaLang/julia/issues/15276; it has already been improved a lot in that it’s much harder to trigger these days. A much milder breaking language change would be to force assignments in closures to be type-stable. That would allow type inference to ignore closures, since they would not be able to change the type of any local variables.


Btw, this line of reasoning is also why it’s not great to require outer x = blah or something like that to update an outer local x from an inner function: you are introducing a new inner function body when you do things like open(file) do and if there was an assignment to a local in the code you were wrapping, now it’s going to fail. Our current design is the design you need to make adding and removing do block constructs completely non-disruptive.

Consider the situation in Python for comparison: instead of having a general purpose feature like do blocks, Python has with open(file) as x: which is only for cleaning up at the end of the block, doesn’t introduce a new scope (that would break things), and can’t be based on closures because Python doesn’t have them. Some may prefer that approach, but it’s far less general and definitely un-Julian. The Julian approach is what you see: add a bit of simple syntax sugar — f(args...) do as syntax for passing a closure as the first arg to f — and then use it everywhere to implement many different kinds of idioms, from repeated evaluation with foreach to guarded resource allocation with open. The feature doesn’t care what you do with it, it just provides a nice syntax that’s translated to an existing basic language feature. But that approach is fundamentally enabled by the fact that code inside or outside of a closure works the same because that’s the way closures are designed.
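The do syntax described above is only sugar for passing an anonymous function as the first argument, as this sketch shows. The helper with_logging is hypothetical and not part of Base.

```julia
# These two calls are equivalent: the do block becomes the first argument.
squares_do = map(1:3) do x
    x^2
end
squares_lambda = map(x -> x^2, 1:3)
println(squares_do == squares_lambda)  # prints true

# Any function that takes a function as its first argument works with do.
function with_logging(f, label)
    println("start ", label)
    result = f()
    println("done ", label)
    return result
end

answer = with_logging("demo") do
    6 * 7
end
println(answer)  # prints 42
```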


It is here that I disagree. Why should both work the same way? Both have totally different stackframes!

What this should do in my fantasy capture-by-value world is throw an UndefVarError: s is the left-hand side of an assignment s = s + x, and by the scoping rules it is presumed local to the anon-function stackframe. Then +(s, x) fails because s is not defined yet.

Regarding clarity/transparency: An important thing is not just “given this source code, what should intuitively happen (semantically)”, but also “given this source code, what kind of machine code and data layout should be intuitively emitted”. On this front, julia is already miles ahead of e.g. scala or large parts of C++, and it is one of my favorite things about it.

If you value both kinds of simplicity to a similar degree, then closure-by-value would be an actual simplification to the language.

And again on this front, requiring outer blah = ... does not really help with the issues I have: I really hate it when the syntax adds hidden indirections and shared mutable state (the Ref / Core.Box wrapper) or hidden stackframes; why not make the user write this out explicitly? Thanks to Ref, this is only a very modest amount of verbosity.

I mention scala so often both because I have had to write a lot of it for the last two years and because it is an interesting example: The scala people take the “should behave the same, regardless of whether in a function or not” thing very seriously. Indeed, I consider scala the reductio ad absurdum of your assumptions, to the point that you never know in what stackframe a block of code executes, or what fields an object has (i.e. one needs to javap everything, and after knowing llvm-IR, java bytecode is just plain ugly).

This makes some things very idiomatic to write, but in e.g. fun(do_something()) you know nothing about behavior without looking up fun, not even whether do_something() executes at all, or when it does: It could be def fun(block: => Any):Unit = {}. I utterly hate it, because readability is more important than writeability.
