Why not an even-harder scope?

Having spent a lot of time thinking about it, I really believe that our current design is very close to optimal given just these assumptions:

  1. Locals may be implicitly declared by assignment
  2. The rules for all local scopes are the same
  3. The language has closures

There are languages where all variables have to be explicitly declared, i.e. you have to write var x or local x to declare a new local. This, of course, makes all problems relating to what scope an assignment assigns to go away: it’s the innermost enclosing scope that declares the variable, and if there’s none, it’s an error. I would note that you can already program Julia this way if you want to: just declare every local with local x and the language will behave just like this. There are only two differences: you won’t get an error if a variable has no declaration, and you have to declare a variable global before assigning to it from a local scope.
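To make that concrete, here is a minimal sketch of the "declare everything" style in Julia. The names sum_of_squares, bump! and counter are made up for illustration and are not from any library:

```julia
# Every local is introduced with an explicit `local` declaration.
function sum_of_squares(xs)
    local total = 0          # explicitly declared local of the function
    for x in xs
        local sq = x^2       # explicitly declared local of the loop body
        total += sq          # assigns the function's `total`, the innermost declaration
    end
    return total
end

counter = 0                  # a global (hypothetical name)

function bump!()
    global counter           # must be declared global before assigning from a local scope
    counter += 1
    return counter
end
```

Calling sum_of_squares(1:3) adds 1 + 4 + 9, and each call to bump!() increments the global counter. Nothing forces you to write the local declarations, which is the first of the two differences mentioned above.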

The second assumption precludes rules like the outer one proposed above—we’ve already discussed why it’s desirable to go back and forth between a closure and other scoped constructs like loop bodies, so I won’t repeat that here.

Which brings us to the question: why should loops have scope in the first place? After all, loops don’t have scope in Python. The answer is that Julia, unlike Python, has always had closures and encourages using them. The connection is probably not immediately obvious, but pretty much every time you want to know why some scope thing works the way it does, the answer is “closures”. So let’s work this through step by step. If a language has closures, you can do things like this:

julia> fns = []
Any[]

julia> for i = 1:5
           push!(fns, () -> i)
       end

julia> [f() for f in fns]
5-element Vector{Int64}:
 1
 2
 3
 4
 5

There are variations on this like using a comprehension instead:

fns = [() -> i for i = 1:5]

But the core issue is the same: you want i to be local to the for loop or comprehension body so that when it gets captured, each closure gets a separate i instead of all of the functions capturing the same i. Suppose we did what Python does and loops didn’t introduce scope at all. We can simulate that here by capturing a global i:

julia> fns = []
Any[]

julia> for ii = 1:5
           global i = ii
           push!(fns, () -> i)
       end

julia> [f() for f in fns]
5-element Vector{Int64}:
 5
 5
 5
 5
 5

Oops. It’s actually even worse than this example suggests: not only is the same i captured in each loop iteration, but it’s also global, so if it gets assigned after the loop, things go horribly wrong:

julia> fns = Function[]
Function[]

julia> for ii = 1:5
           global i = ii
           push!(fns, () -> i)
       end

julia> i = "oops!"
"oops!"

julia> [f() for f in fns]
5-element Vector{String}:
 "oops!"
 "oops!"
 "oops!"
 "oops!"
 "oops!"

Oops, indeed.
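The same contrast shows up without any globals. Here is a small sketch, using the made-up function names shared_binding and fresh_binding, that compares a variable declared once outside a loop with a for loop variable:

```julia
# One `i` declared outside the loop: every closure captures that single binding.
function shared_binding()
    fns = Function[]
    i = 0
    while i < 3
        i += 1
        push!(fns, () -> i)
    end
    return [f() for f in fns]
end

# The `for` loop variable is a new local in each iteration,
# so each closure captures its own `i`.
function fresh_binding()
    fns = Function[]
    for i in 1:3
        push!(fns, () -> i)
    end
    return [f() for f in fns]
end
```

Here shared_binding() returns [3, 3, 3], because all three closures see the final value of the one shared i, while fresh_binding() returns [1, 2, 3].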

So, if you have closures, loops really must have scope. Python has traditionally dealt with this by refusing to have closures, much to the consternation of everyone who wants to do functional programming in Python. But at some point Python conceded that closures were quite useful and added lambdas, which leads to this unfortunate brokenness in the presence of Python’s scope rules:

>>> fns = [lambda: i for i in range(0,5)]
>>> [f() for f in fns]
[4, 4, 4, 4, 4]

Not great, Bob. And in Python 2, the i is global so it’s really bad:

>>> fns = [lambda: i for i in range(0,5)]
>>> i = "oops!"
>>> [f() for f in fns]
['oops!', 'oops!', 'oops!', 'oops!', 'oops!']

In Python 3 they fixed this by making i local to the comprehension (although local to the whole thing, not a new local for each iteration, so you still capture the same local i five times). And even in Python 3 you still have this trap:

>>> fns = []
>>> for i in range(0,5):
...     fns.append(lambda: i)
...
>>> i = "oops!"
>>> [f() for f in fns]
['oops!', 'oops!', 'oops!', 'oops!', 'oops!']

Because while comprehensions now introduce a local scope, loops still don’t, so the i loop variable is not only shared by all the closures, but it also leaks out into the global scope where you can clobber it and screw up all of those closures. While it’s good that comprehensions have their own scope, now you have the issue that a loop and a comprehension behave subtly differently, so going back and forth between them may cause bad bugs.

Anyway, this isn’t to pick on Python, but to point out that they don’t have it figured out—scope is a pretty huge footgun in Python. And, more importantly for Julia, to show why a language that has closures really needs to have pretty granular scopes, in particular including loops and comprehensions.

The only thing that might be appealing to change in Julia 2.0 would be to make behavior in files match the REPL but still print a warning in the case where an implicit local shadows a global of the same name. That would essentially bring back the full 0.6 rules with the addition of a warning in the case where a local implicitly shadows a global, which is precisely the case where the 1.0 rules differ from the 0.6 rules. That way the behavior would be the same everywhere (albeit with the more complex 0.6 rules), with an extra warning in the ambiguous case in a file, while the REPL would be slightly more lenient and allow you to assign a global from a loop without needing to declare the assignment global to avoid the warning.
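For reference, the ambiguous case is a top-level loop that assigns to a name which is also a global. Here is a minimal sketch, assuming Julia 1.5 or later and using the made-up name total, of the spelling that means the same thing in a file and in the REPL:

```julia
total = 0                 # a global defined at top level (hypothetical name)

for i in 1:3
    global total += i     # explicit `global`: updates the global in a file and in the REPL
end
```

After the loop, total is 6. Dropping the global annotation gives exactly the case discussed above, where the REPL and a file currently disagree about whether total inside the loop is the global or a new local.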
