The manual's section on variable scope sucks

I know it’s not for the lack of trying, but the manual’s section on scope is unclear if not outright wrong.

Issue 1
The manual says:

If you assign to an existing local, it always [emphasis not mine] updates that existing local: you can only shadow a local by explicitly declaring a new local in a nested scope with the local keyword.

a few lines below, it says:

When x = occurs in a local scope, Julia applies the following rules to decide what the expression means based on where the assignment expression occurs and what x already refers to at that location:

Existing local: If x is already a local variable, then the existing local x is assigned;

There are known exceptions to this:

module M
    local x
    let
        x = 0
    end
    @assert x == 0 # errors. `x` is not defined
end

The exception is explained in a different section under the title “Global Scope”:

If a top-level expression contains a variable declaration with keyword local, then that variable is not accessible outside that expression.

Issue 2:

The manual says:

A scope nested inside another scope can “see” variables in all the outer scopes in which it is contained.

This is incorrect:

module A
    x = 1
    module B
        global x
        @assert !(@isdefined x)
    end
end

Issue 3:

On soft scope, the manual says:

If x is not already a local variable and all of the scope constructs containing the assignment are soft scopes (loops, try/catch blocks, or struct blocks)… [and] if global x is defined… in interactive contexts (REPL, notebooks), the global variable x is assigned.

First, the condition “all of the scope constructs containing the assignment are soft scopes” can never hold, because all scope constructs are always in some global scope. For instance, expressions evaluated in the REPL are evaluated in the Main module. But even excusing this ambiguity, the instruction still does not hold. When the following is evaluated in a REPL, the global x is never defined:

module M
    global x
    for i in 0
        x = i
    end
    @assert !(@isdefined x)
end

Issue 4:

The hard-vs-soft scope distinction is relevant only to certain blocks (such as for ... end) in the REPL. It has been felt before that the distinction receives undue emphasis (see “Local scope rules are confusing in inner scopes (Julia 1.6)”, post #2 by jeff.bezanson). Yet the manual pretty much starts off with hard-vs-soft scope.

Issue 5:
The manual is titled “Scope of Variables”, but it contains statements like:

the for loop body has its own scope

It is not always consistent about whether scope is the property of a variable or that of code blocks.

Issue 6:
The manual says:

Whereas assignments might reassign a new value to an existing value location, let always creates a new location.

This is not true:

let x
    x = 0
    let
        x = 1 # does not create a new variable
    end
    @assert x == 1
end

Issue 7:
Is struct-end really soft scope?

global x
for i in 1:1
    if true
        x = 0
    end
end
@assert x == 0
struct A
    if true
        x = 1
    end
end
@assert x == 1         # errors

I think a better approach might be to get users thinking about where in the code a variable is implicitly or explicitly declared, and to point out the distinction between variable declaration, variable assignment, and variable definition (declaration followed by assignment).

We may also want to draw a distinction between a variable’s scope and the block constructs (such as let ... end blocks) that define the boundaries of variable scope. This gives us better language, and helps us avoid statements like “a global variable is accessible anywhere inside a global scope”, which is just not true:

module Outer
	x = 0
	module Inner
		global x
		@assert !(@isdefined x)
	end
end

Here, the scope of the variable x is not the entirety of the Outer module.

The rest is just about global v local scope. Soft-scope can be explained as the exception, with motivation, once the rest of the content is covered.

The manual does contain a lot of important information, including examples and the rationale behind the soft-scope behavior, which I think should be kept. But I nevertheless started this shorter writeup which might serve as a starting point:

I was hoping folks here would review it, propose more examples, and point out any mistakes.

As you seem to be aware, this manual section has been rewritten a couple times and none of these revisions have ever made everybody happy.

Many of the problems you mention come down to module, which does not use lexical scope at all, but rather each module introduces a new global scope that can only share identifiers with others via import/using. Every direct subexpression of a module block is a top-level expression. So for example

module A
local x
end

is a degenerate case where the local x only exists within the expression local x itself. A module cannot have local variables of its own, it only has global variables.
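A minimal sketch of that degenerate case, assuming it is run as a script at top level (the names LocalDemo and x_survives are made up for illustration):

```julia
module LocalDemo
    # Every direct subexpression of the module is its own top-level expression.
    local x = 1                   # this `x` exists only within this one expression
    x_survives = @isdefined(x)    # a separate top-level expression: no `x` here
end

@assert LocalDemo.x_survives == false
```

The second line inside the module looks up a global x in LocalDemo, which was never assigned, so @isdefined reports false.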

The issue with let is that it only introduces new variables listed right next to the keyword, i.e. with let x = 0. If you do

let
    x = 0
end

you have a let that introduces a scope with no new variables and the rest is just the body.
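To make the difference concrete, here is a small sketch wrapped in a function so that only local-scope rules apply (the function name and the variables after_header and after_body are hypothetical):

```julia
function let_header_vs_body()
    x = 1

    let x = 2          # header: always declares a new `x` for this block
        x += 10        # changes only the new `x`
    end
    after_header = x   # the enclosing `x` is untouched

    let
        x = 3          # body: `x` is an existing local, so it is reassigned
    end
    after_body = x

    return after_header, after_body
end

@assert let_header_vs_body() == (1, 3)
```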

Correct, I have never liked the idea of “hard” vs. “soft” scopes, because it makes it sound like we have these two kinds of scopes all over the place. In fact all there are is global and local variables, plus some slightly complex rules about what happens at the top level.
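As an illustration of those top-level rules, the following sketch evaluates code non-interactively with include_string, which behaves like include on a file. The anonymous module file_like and the result variables are assumptions made for the demo. The first loop prints the "ambiguous soft scope" warning and leaves the global alone; pasted into the REPL, the same loop would assign the global instead.

```julia
# A fresh module stands in for "a file being evaluated".
file_like = Module()

Base.include_string(file_like, """
x = 0
for i in 1:3
    x = i          # ambiguous at file top level: warns, treated as a new local
end
""")
x_after_plain_loop = Core.eval(file_like, :x)

Base.include_string(file_like, """
for i in 1:3
    global x = i   # explicit: assigns the global `x`
end
""")
x_after_global_loop = Core.eval(file_like, :x)

@assert x_after_plain_loop == 0
@assert x_after_global_loop == 3
```

Inside a function body neither the warning nor the REPL/file difference arises, because there is no global assignment to disambiguate.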

When you evaluate a module expression in the REPL, the inside of that module is no longer in the REPL context. I’m not sure how exactly we should clarify that. But the most important point here is that module is not a normal language construct; it’s a very special thing that introduces a whole new top-level entity, it is not just part of program structure like for or let. It would be better in some ways if nesting modules were not allowed, but that would just not be very practical for code organization.
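A short sketch of what "a whole new top-level entity" means in practice, assuming script-level evaluation (OuterDemo, InnerDemo, sees_x and x_from_parent are illustrative names): nesting a module textually does not make the outer module's globals visible, but qualified access through the parent still works.

```julia
module OuterDemo
    x = 0
    module InnerDemo
        # Textual nesting alone does not bring OuterDemo's `x` into view.
        sees_x = @isdefined(x)
        # Going through the parent module explicitly does reach it.
        x_from_parent = parentmodule(@__MODULE__).x
    end
end

@assert OuterDemo.InnerDemo.sees_x == false
@assert OuterDemo.InnerDemo.x_from_parent == 0
```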

At a high level, the reason all of this works is that 99% of programming is inside functions, where everything is pretty simple. Most code never does or should assign to global variables.

Could the documentation be improved by providing all combinations of the simplest examples in comparing different global/local variables and their interactions with modules and other blocks? It could have println or @show statements and error results to visibly demonstrate the interaction between scopes. And demonstrate any differences when in the REPL.

Or are the combinatorics on that too proliferative?

I don’t like the section either, but some feedback by issue:

  1. I think you’re pasting document text into triple-tick ``` blocks. The absence of newlines is formatting paragraphs as long lines with long sliders, use a > block instead.

  2. Since the local x does not exist after its expression, then there isn’t an existing local in the global scope of M by the time the let block runs. This isn’t an exception, the rules are accurate.

  3. In the second paragraph of the page, it states “There are two main types of scopes in Julia, global scope and local scope. The latter can be nested.” I interpret this to mean that while you can write module blocks inside each other, the global scopes are not nested in each other. It’s consistent with the first line of the global scope section “Each module introduces a new global scope, separate from the global scope of all other modules—there is no all-encompassing global scope.” I do believe it could be clearer.

  4. Your example with global x does not define x, as you say. So, the rule “if global x is undefined, a new local named x is created in the scope of the assignment” is correct, you cited the wrong rule.

  5. Soft scope needs to be explained from the beginning because of its diverging behavior between interactive sessions and evaluated source code. People naturally write and run code as they learn, and it’s not fair to leave out rules they’ll easily run into. Delaying the explanation for why this complication exists is the best we can hope for, and that’s what the “On soft scope” section is for. For the record, I do wish soft scope didn’t exist, as I can easily understand that distant files make reassigning forgotten globals a problem, and I’d prefer a debugger over pasting locally scoped code in persistent global scopes. But that’s just not how Julia v1 developed.

  6. First sentence is “The scope of a variable is the region of code within which a variable is accessible,” which is clear to me. I wouldn’t be confused by whether a favorite ice cream flavor of mine is strictly a property of me or the ice cream, I know how I and the ice cream are involved.

  7. In context, it’s talking about the variables in the let header, not all local variables in its scope. I agree it’s needlessly vague.

  8. Never thought about evaluating code other than function blocks in struct blocks, and I can reproduce this.

I do actually think like this, but it’s only by reducing the rules to a very short version centered around accesses vs assignments, so soft scope comes up fairly quickly in my mind. There’s too much implied knowledge (like what variables or scopes even are) to replace the comprehensive scoping rules.

@jeff.bezanson says:

module is not a normal language construct; it’s a very special thing that introduces a whole new top-level entity, it is not just part of program structure like for or let

I can appreciate this to some degree, but at the same time, if this means we cannot really have the notion of global scope, I don’t know how to talk about it or document the behavior of code. The important thing is that, when users want to know how their code behaves, they can refer to the documentation and know exactly what their code will do. I am proceeding, for now, with the assumption that talking about global and local scope will still make that possible.

I did see a comment in some discourse thread that suggested that thinking about scope is not too bad if you think of each expression as looking up what variables are available to it (or something of that sort). If I knew how that worked under the hood, I think it could be a good place to start. But for now, in the github gist I shared, I have taken the following approach:

  • I distinguish between variable declaration, assignment, and definition (declaration followed by assignment). I came to conclude that when the manual talks about a variable being defined, it really is just talking about variables that are declared. But maybe in some cases when it says defined, it does actually mean declared and assigned, like in issue 3 (as @Benny pointed out). I’ve found it helpful to keep the notion of variable declaration in mind. Now, I know that let x end and function f(x) end declare a local variable x, but let end and function f() end do not. It helps me reason better about any x inside the function or let blocks.

  • I talk about soft scope as an exception to the rule. I don’t deny, @Benny, that soft scope is important. I am saying that it would be less confusing to think about it as somewhat of an exception in interactive contexts. If I remember correctly, it is also just the default behavior in the REPL, and can be changed.

  • I use the term “scope” as if it is a property of the variable, and call the different blocks “locally-scoped” and “globally-scoped” blocks. I think it is confusing to say “a let block has scope”. @Benny , to use your analogy, I want to distinguish between taste and flavor: “Icecream has flavor. Benny has taste.” So the interaction is between flavor and taste. Right now, I feel like the manual is saying “Icecream has flavor. Benny has flavor.” or “Icecream has taste. Benny has taste.”

  • @kapple , I collected a bunch of examples from discourse to showcase some complex cases. These can be thought of as exercises. The idea is that if you can think through these examples and figure out how the code behaves, you can be more confident in your understanding of scope. I still don’t know what to do with issue 7 though.

It would be good to have more experienced eyes review the gist. I will leave it up, but I wouldn’t want people to use it as reference if it’s not accurate. If people find this framing helpful, we can think about revising the manual. Or just have it around as an alternative.

It’s confusing. The very first mention of “declaration” is the example local x = 0. Later you see “writing local x declares a new local variable in that scope”. Finally you see “Multiple variables can be declared in a single const statement: const a, b = 1, 2”. To any reasonable reader, that looks a lot like the examples of “definitions” except you might omit the assignment for a moment. If there’s a clear distinction between declaring and defining a variable, I’m not seeing it on that page. I had to find out through @isdefined experiments, hence knowing that global x and local x statements alone do not define x.
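The kind of @isdefined experiment meant here can be sketched as follows (declared_not_assigned and g_only_declared are hypothetical names): a bare local x or global x declares the name without giving it a value, so @isdefined stays false until the first assignment.

```julia
function declared_not_assigned()
    local x                    # declared, but no value yet
    before = @isdefined(x)
    x = 1                      # first assignment
    after = @isdefined(x)
    return before, after
end

@assert declared_not_assigned() == (false, true)

# The same holds for a bare global declaration.
global g_only_declared
@assert !@isdefined(g_only_declared)
```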

English is ambiguous, the context is not. It’s pretty clear that module/baremodule introduce global scopes and most other blocks introduce local scopes. That obviously seems to be a property of the blocks. On the other hand, a local/global variable belongs to one entire local/global scope, and the same name in a different scope is a different variable. The home scope is a property of the variable. People don’t often fail to distinguish the meanings of a block or a variable “having local scope”, and trying to starkly divide the phrasing from “locally scoped” is exactly the source of confusion. In fact, variables are routinely said to be locally or globally scoped in all sorts of languages, including Julia.

I share your frustration about that part of the manual. If, after digging into the issues, you think you could do better, I would encourage an editorial PR that restructures the chapter from the ground up.

This is how I would start it:

  1. ditch the table in the beginning, it refers to concepts introduced later

  2. explain hard local scope without the frills, that is, just the constructs that introduce it and how it works, with examples. do not mention global at this point. This section is just about plain vanilla hard local scope, with examples using function, let (mention the linebreak issue), and friends. Suggest that the user writes code that is within functions as good practice. If necessary, introduce a new CSS style that shows this as blinking red text with fireworks.

  3. explain global scope and how it is tied to modules (defaulting to Main in the REPL), again avoiding mentioning local at this point. mention typed globals and const in this section, and warn the user away from using globals. Yes, it is occasionally needed, but rarely needed.

  4. now it is time to explain soft local scope, with a brief historical note on why it was needed, not longer than two sentences. At this point the reader has a grasp of hard local and global scopes, so they are in the position to grasp the soft local scope. Again, no local or global keywords up to this point, everything is about the defaults.

  5. the moment you have all been waiting for: introduce how you can switch the defaults using local and global. Clarify what kind of local scope (soft or hard) you end up in when you use local. Tell the reader that these are needed very, very rarely — you can write 10^n for n \ge 4 LOC of perfectly idiomatic Julia as a user without using either of these keywords. That said, present realistic examples after, taken from existing Julia code, eg simplified from Base or the standard libraries.

  6. a section of various simple corner cases with examples, and some historical notes. Emphasize the purpose of the soft scope, the related error message, how to deal with it. Mention that the module docstring is evaluated within the module scope, refer to the modules section.

  7. a section on various brain teasers from that chapter. emphasize that idiomatic Julia code should be written in a way so that the user does not have to think about these most of the time. Conversely, if the experienced reader finds that they are wondering about scope, it is very likely that the code they are reading could benefit from refactoring. So this part should be skipped on a first reading. But hey, you asked for it, so here are some corner cases.

Some editorial suggestions:

  1. never use “local scope” in text without the qualifier (“soft” or “hard”). If the text applies to both, write it out (“soft or hard local scope”).

  2. focus on idiomatic usage, not corner cases. Users who read up to 50% of this chapter (4. above) should just know that weird corner cases should be avoided, and 99% of the time you can just write code and enjoy programming Julia. Yes, discuss corner cases, but now the whole chapter reads like a Dan Brown novel, with hints about arcane details that could turn out to be significant later on.

  3. clarify terminology like “after”/“before” (shows up eg in the soft scope section) before it is used. Does it refer to textual order, evaluation order, or what? Maybe use an entirely different phrase specific to that.

Note that this is a major undertaking for which you have to do research. I would estimate at least 30 person-hours for even an experienced Julia user. But it seems you invested effort into understanding scope so you are off to a good start.

Ok, so I’ve programmed Julia for 5 years and have never really investigated the difference here

let x =1
   y=2
...
end

let 
   x=1
   y=2
   ...
end

In the second case, what is the scope of the variables?

It depends on the context in which your code appears:

let
    let x = 1
        y = 2
    end
    let 
        x = 3
        y = 4
    end
    @assert !(@isdefined x)
    @assert !(@isdefined y)
end
let y
    let x = 1
        y = 2
    end
    let 
        x = 3
        y = 4
    end
    @assert !(@isdefined x)
    @assert y == 4
end

@Benny, the reason I started to distinguish between a variable’s scope and a block’s scope, if I am to use the word “scope” for both, is the following (and maybe @dlakelan was getting at the same confusion):

let x
    let
        x = 0
    end
    let x
        x = 0
    end
end

How many scopes are there? There are three let blocks. But I would say there are two variables and hence two scopes. You also can’t really say that the scope of the x declared in the outermost let block is the entirety of that block because, inside the second inner let block, x refers to a different variable. But if there are better ways to think and talk about it, I welcome them.

@Tamas_Papp, thanks for your suggestions. It’s good to see how someone experienced in Julia would redo the manual section. It’s also interesting to see that even experienced people differ in what aspects of variable scope they think are worth emphasizing.

Soft scope is specific to the REPL, can be turned off, and is there just so the REPL can mimic behavior inside functions to suit the workflow of some people. From that perspective, I and others think it should be treated as an exception. On the other hand, it is possible that a lot of people run into soft scope. So maybe it deserves more emphasis in the manual. Short of well-executed user surveys, I don’t know how to empirically resolve the dispute.

This and other issues were discussed at length before I joined the community, I’m sure.

I think that, for now, I remain convinced that introducing the notion of variable declaration is the way to go. Just being able to read code and understand which variable is getting declared when has helped me understand scope. If I see some x and don’t know its scope, I just try to figure out where it was declared.

I similarly think, for now, that local scope should be talked about generically before distinguishing between hard and soft scope.

I would also not have the manual section on scope recommend people write code inside functions. Doing so is good practice for several reasons, only one of which might be that it helps reason about scope better.

While I would be happy to work on rewriting the manual section, we can already see that a lot of stylistic questions are raised. So, for now, I will stick to updating the github gist page. Maybe if there are significant updates I’ll post here. And if it ends up being useful, we can use it as a basis for the manual section.

Again, you’re trying to force the concept of a scope entirely on the variable. The documentation never says that and explicitly quashes any such notion:

When we say that a variable “exists” in a given [local] scope, this means that a variable by that name exists in any of the scopes that the current scope is nested inside of, including the current one.

I don’t like the wording because “any” could be misconstrued as “all” instead of originating in a particular outer scope, but it’s clear that a variable can be accessed and assigned in multiple scopes, in other words a variable does not determine a scope. It’s easy to imagine that a person has 1 hometown but has lived in multiple states, it makes no sense to interpret “an apartment of theirs” to mean the one particular apartment is an inherent property of them. As the documentation states, that first nested let block introduced its own local scope, and it could access the outer scope’s x; it’s just wrong to assert that the absence of its own variables means there’s no new scope there.

I don’t think there was ever a choice between tying scope to blocks versus variables; it was a typical expectation long before Julia was developed. But if there was one, it’s very easy to see that a variable-identified scope makes little sense:

                 # variables and existence lines, omit end
let x = 0        # x
  y = 0          # | y
  println(x,y)   # | |
  let y = 1      # |   y
    println(x,y) # |   |
  end            
  x, y           # | |
end              

Do 3 variables make 3 scopes total? That doesn’t seem right, I wouldn’t say defining a dozen variables in one block makes a dozen scopes that overlap perfectly. Or are the scopes only separate if they don’t overlap perfectly, like this? Or does x share a scope with the first y to make 2 scopes total for 2 blocks? There’s no right or wrong answer there, only weird ones. Isn’t it much simpler to say each let block introduced its own scope, x was defined in the outer scope and is accessible in the inner scope, y was defined in the outer scope, and another y was defined in the inner scope and shadows the outer one?

Just so I understand what’s going on… What’s the scope of x in:

function foo(a)
    b = a + 1
    let
        x = 2
        println("b+x = $(b + x)")
    end
    println(@isdefined(x))
    b
end

What I get is that x isn’t defined, and x appears to act as if it had just the “let” as its scope. If I try to return b+x, it throws an error.

function foo ... end introduced a local scope, let ... end introduced a nested local scope. x was defined (assigned and implicitly declared) in the nested local scope and will only be accessible there. b was defined in the function scope and is accessible in that scope and in any nested scopes by default.

Yes, this is what I thought. So then do I have this right:

let x = 0

end

Guarantees that x is scoped in the let, regardless of if it exists or not in surrounding scope.

let 
x = 0
...
end

Here if x exists in surrounding scope the x refers to the surrounding scope, otherwise it’s a new variable with the let as its scope?

You are correct, but to avoid confusion down the line I suggest clearly separating different variables that happen to share a name instead of saying “regardless of if it exists or not in surrounding scope”. Instead, say that let x=0 declares and assigns (you need not assign immediately) a new local x in its scope. It has almost the same effect as:

let
  local x
  x = 0
end

except the perk of the let header is the right sides of the assignments access the outer scope if a variable hasn’t been declared yet or is mid-declaration, letting you define a new local variable with the help of an outer variable of the same name. I prefer different names for different variables especially when they’re so close in the source, but it could save the effort of renaming a bunch of variables when refactoring interactive code into more nested scopes.

let
# outer
  x = 0
  println(x) # 0
  #   new outer    new
  #   V   V        V
  let x = x+1, y = x
    println(x, y) # 11
  end
  println(x) # 0
end

If you move that x = x+1 inside the block, both x are in the same scope and must be the same variable, whether it originates in the outer (default) or the nested scope.
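A sketch of both variants of "moving it inside", again in functions so only local rules apply (function names are made up): without a declaration the body updates the enclosing x, while an explicit local x makes both sides refer to the new, still unassigned variable.

```julia
function increment_in_body()
    x = 0
    let
        x = x + 1      # both `x` are the enclosing variable
    end
    return x
end

function shadow_in_body()
    x = 0
    let
        local x
        x = x + 1      # both `x` are the new local, which has no value yet
    end
    return x
end

@assert increment_in_body() == 1

# Reading the unassigned local throws before the assignment can happen.
threw = try
    shadow_in_body()
    false
catch err
    err isa UndefVarError
end
@assert threw
```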

@dlakelan , here’s one thing to keep in mind with your example:

let
    let x = 0
    end
    let
        y = 0
    end
    @assert !(@isdefined x)
    @assert y == 0
    y = 1 # the assertion above fails if you comment this out
end

Here, the expression y = 1 means that y is defined in the outer scope, even though the line of code appears after the blocks.

I don’t understand why you need to use local here. Why not just say let x? Are the two equivalent or is there a difference?

The implicit reassignment thing (especially in let but also in functions) has really bothered me for a while. My opinions on how to improve this are to modify the language rather than the docs. It’ll be easier to explain in the docs if the actual rules are simpler.

Didn’t @c42f have a nice demo where different “identities” of the same variable name in a block of code were highlighted? Maybe something like that would be good for the docs, show a couple of cases with this highlighting to demonstrate what scope certain variables belong to, when an inner variable shadows an outer, etc.

You’re right, it’s redundant in that example, the explicit statement is just illustrating the guaranteed declaration of a new local that let x does as well. It would matter if there were an outer local x.

Early in learning the language, I asked why the rules weren’t just simplified to entirely defaulting to assigning new locals. It became pretty apparent how much of a chore it was, that it was unnecessary for local scopes compared to global scopes split across files, and that it was a no-go to a user base that ended up reintroducing soft scope in part because writing global x was too much. However, I did suggest that the for-loop-exclusive outer keyword could be used to also explicitly designate reassignment of outer local variables. I think that would mirror local x better than x :=. Likewise, I’d prefer reusing familiar syntax over a new keyword delineating let block bindings (commas do work for multiple lines, but a newline still makes a difference, which could make an even more unsavory example of similar but different code), but I don’t have a clear suggestion (begin usually contains more than just assignments, parentheses with no intervening whitespace like a function call is clearer but both are still changed by a newline).
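For readers who have not met the existing for-loop outer keyword, a brief sketch of what it does today (the two function names are illustrative; the broader use suggested above is a proposal, not current behavior):

```julia
function last_index_with_outer(n)
    i = 0
    for outer i = 1:n   # reuse the enclosing `i` as the iteration variable
    end
    return i
end

function last_index_default(n)
    i = 0
    for i = 1:n         # default: the iteration variable is new to the loop
    end
    return i
end

@assert last_index_with_outer(3) == 3
@assert last_index_default(3) == 0
```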
