The manual's section on variable scope sucks

If there are n variables and m blocks, there are n + m of something scope-like. I just wish for language that distinguishes between them so we can talk about all n + m things. Maybe it’s variable-scope and block-scope.

I didn’t mention the proposal of := because it’s not happening until Julia 2.0. I wanted to see what we can do with what we have at hand, so documentation becomes important. I do agree that it would allow us to distinguish between declaration ( let x, local x, f(x) = ...), definition :=, and assignment =, which is at the source of a lot of confusion. But we should be prepared to see code that looks like f(x:=0) := nothing if we are going to use := consistently.

The difference is that := would be for all reassignments whereas outer is for only reassignment into other scopes. Either one would solve the main problem. Python does the outer thing (“nonlocal”).


begin end is so verbose that I wouldn’t want to use it – personally I think begin end should’ve been { } but nobody wants to hear that.

I misread it then. So it’s = for the original definition and := optionally(?) for subsequent reassignments. That doesn’t directly choose between the outer and the local scope, but it would fit naturally, because in a nested scope a := versus a = makes outer versus local more obvious. Its static checks could involve more than a scope boundary.

On the other hand, I had imagined that explicit outer/local could evade the difference in default behavior between local and global scopes, allowing source code to behave the same between them, especially from macros intended to work in both. It didn’t really go anywhere because it was hard to imagine how metaprogramming could replace the default behavior with outer/local automatically when macro calls were involved.

I am wondering how difficult it would be to write a utility package that takes a piece of code, parses it, then spits out an annotated version that shows the scope of each variable.
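
As a rough idea of the first step such a tool would need, here is a minimal sketch that parses a snippet with Meta.parse and collects the names on the left of a plain `=`. The helper `assigned_names` is hypothetical, not an existing package function, and it does no scope resolution at all; it only shows that the parsed Expr tree is easy to walk.

```julia
# Hypothetical helper (not from any package): walk a parsed expression and
# collect every Symbol that appears directly on the left of a plain `=`.
function assigned_names(ex, found = Symbol[])
    if ex isa Expr
        if ex.head === :(=) && ex.args[1] isa Symbol
            push!(found, ex.args[1])
        end
        # Recurse into all sub-expressions; non-Expr leaves are ignored.
        foreach(arg -> assigned_names(arg, found), ex.args)
    end
    return found
end

snippet = """
function f(x)
    let y = x + 1
        z = 2y
    end
    x
end
"""

code = Meta.parse(snippet)
names = assigned_names(code)   # collects :y and :z
```

Deciding which scope each name belongs to is the hard part and would need the actual lowering rules, which this sketch does not attempt.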

Given the excellent outline by Tamas, that is something an AI assistant could probably produce. A nice case for a competition between different AI assistants?

I am thinking about doing the PR myself if no one else is interested.

You mean a PR for the manual or a package? The manual I can help with. I don’t think I am skilled enough to work on such a package, but I can help with testing.

The language manual, of course.

I think JuliaLowering.jl (c42f/JuliaLowering.jl on GitHub, “Julia code lowering with precise provenance”) will be the right tool for this, but it’s not ready yet.

Working on a PR already, but I am wondering about the best way to include explanations within the code. Instead of just text explaining each code snippet, I think it would be great if examples also came with simple explanations as comments. (This was the practice for a lot of the discussions).

Currently I am leaning towards Unicode box characters, as in

function f(x)
    #      scope of let
    #     ╱   scope of function f
    #    ╱   ╱
    let x = x + 1
        # scope of let
        #╱
        x += 2
    end
    x
end

The reason is that even if the variable is the first character in a line, the marker can come after the # and still be well-aligned. Suggestions/comments are appreciated.

I find it easier to read when the # are all in the leftmost column:

function f(x)
#          scope of let
#         ╱   scope of function f
#        ╱   ╱
    let x = x + 1
#         scope of let
#        ╱
        x += 2
    end
    x
end

What follows is my draft on how I would organize a chapter, without the examples. Comments are welcome, especially if I got something wrong.

Preamble

This is all about lexical scope, i.e., the location of definitions matters, not the nesting of the call stack.
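
A small sketch of what “lexical” means here, assuming the code runs at the toplevel of a file or the REPL. The names `name_demo`, `report_demo`, and `caller_demo` are made up for illustration.

```julia
name_demo = "module-level"

# report_demo looks up name_demo where report_demo is *written*: the toplevel.
report_demo() = name_demo

function caller_demo()
    # Assignment inside a function body creates a new local here;
    # report_demo cannot see it, even though it is called from this frame.
    name_demo = "caller's local"
    return report_demo()
end

result = caller_demo()   # "module-level"
```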

Some terminology, and the easy part

A key concept is scope blocks, introduced by

  1. module, baremodule
  2. struct, for, while, try,
  3. macro, function, let, comprehensions, generators
  4. if, begin (yes, more on this later)

The easy part: what a variable refers to is determined like this. You keep checking the next outer block, including modules, until you find the variable in the scope of the block (implicitly, what you type in the REPL is in Main), or error. Notably,

  1. this does not cross submodule barriers: when you reach the innermost module (which is called the “toplevel”) you are done,

  2. order of evaluation does not matter within the scope block, if a variable belongs to a scope block, it does so for the entire block (but may not be assigned, just declared — this is about scope, not values).
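
Point 2 can be illustrated with a sketch like the following, assuming it runs at the toplevel of a file or the REPL (the names are hypothetical). The assignment at the end of the function makes `x_demo` local for the whole body, so the earlier read does not fall back to the global of the same name.

```julia
x_demo = "global"

function reads_before_assignment()
    result = try
        x_demo            # the function's local x_demo: declared, not yet assigned
    catch err
        err               # capture the error instead of letting it propagate
    end
    x_demo = "local"      # this assignment is what makes x_demo local above
    return result
end

outcome = reads_before_assignment()   # an UndefVarError, not "global"
```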

In this, Julia is like all languages descended from Algol 60. Nothing to see here.

For historical reasons, variables at the toplevel are called global, while the rest are called local, even though all of them are within blocks, either implicitly or explicitly. Deal with it.

The tricky part

The tricky part: how variables are introduced into a scope block.

Directly

Some blocks just name variables, which are introduced into the scope of the block. Notably,

Each function argument is in the scope for subsequent arguments and the function body:

function f(x,
           y = x; # x is in scope
           z = y) # x and y are in scope
    # x, y, z are in scope
end

and the same holds for do blocks, macro, and, needless to say, short forms like f(x) = ....
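
A runnable version of the same idea, as a sketch with a made-up function `box_volume`: each default value may refer to the arguments that come before it.

```julia
# height defaults to width, and the keyword depth defaults to height.
function box_volume(width, height = width; depth = height)
    return width * height * depth
end

cube = box_volume(2)                 # 2 * 2 * 2
slab = box_volume(2, 3)              # 2 * 3 * 3
flat = box_volume(2, 3; depth = 1)   # 2 * 3 * 1
```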

Comprehensions and generators:

[scope_of_x for x in ...]
(scope_of_x for x in ...)
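
A minimal sketch (hypothetical function name) showing that the iteration variable of a comprehension or generator is its own variable, even when the enclosing function already has a local with that name:

```julia
function comprehension_scope()
    x = 10
    squares = [x^2 for x in 1:3]     # this x lives only inside the comprehension
    total = sum(x^2 for x in 1:3)    # same for the generator
    return squares, total, x         # the outer x is untouched
end

squares, total, outer_x = comprehension_scope()
```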

for blocks for variables immediately following the keyword:

for x in ...
   # scope of x
end
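
As a sketch of the same point for loops (function names are made up): the loop variable is fresh by default, and the existing `outer` keyword for `for` loops opts into reusing an enclosing local instead.

```julia
function loop_variable_scope()
    i = 0
    for i in 1:3
        # this i is a new variable, separate from the one above
    end
    return i
end

function loop_variable_outer()
    i = 0
    for outer i = 1:3
        # `outer` makes the loop reuse the enclosing local i
    end
    return i
end

fresh = loop_variable_scope()    # 0
reused = loop_variable_outer()   # 3
```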

And finally let, the explicit purpose of which is to introduce a block with scope for a list of variables, each of which is in scope for the subsequent ones:

let x = 1, y = x
    # scope of x and y
end

Note that you separate with commas, and it only affects variables before the line break.
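
A sketch of both remarks, using hypothetical function names: bindings on the `let` line are new variables, while a plain assignment in the body follows the ordinary assignment rules (here, inside a function, it reuses the enclosing local).

```julia
function let_bindings()
    x = 1
    inner = let x = x + 1, y = 2x   # right-hand x is the outer one; y sees the new x
        (x, y)
    end
    return inner, x                 # the outer x is unchanged
end

function let_body_assignment()
    a = 1
    let
        a = 2                       # not on the let line: assigns the enclosing a
    end
    return a
end

bindings_result = let_bindings()       # ((2, 4), 1)
body_result = let_body_assignment()    # 2
```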

By assignment

If you assign (x = ...) within a scoping block, the following happens:

  1. For function, do blocks, the body of comprehensions & generators, and macro, a new variable is introduced. This is “hard local scope”.

  2. For for, while, try, and struct, the rules of “soft local scope” apply. Specifically,

    1. If there is no variable with the same name at the toplevel (a “global”), a new variable is introduced locally, with scope within this block.

    2. If there is such a variable at the toplevel,

      1. in interactive contexts (notebook, REPL) that variable is assigned to and no local variable is created, while

      2. in non-interactive contexts a local variable with scope within this block is created.

  3. For module and baremodule, assignment creates a global variable at the toplevel (this covers const and function declaration, too).

  4. For if and begin, the scope will be the innermost enclosing block that is not if or begin.
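
Item 4 is the easiest one to check directly. A minimal sketch with a made-up function: variables assigned inside `begin` or `if` belong to the enclosing function body, so they are still visible after the block ends.

```julia
function begin_if_scope(flag)
    begin
        a = 1          # belongs to the function body, not to the begin block
    end
    if flag
        b = 2          # likewise; assigned only when flag is true
    end
    return a, (flag ? b : nothing)
end

with_flag = begin_if_scope(true)
without_flag = begin_if_scope(false)
```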

Note that destructuring may count as assignment (eg when within a function):

(; x, y) = ...

or directly introduce variables:

function f((x, y),)
    ...
end

In any case, the rules above apply.
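
Both destructuring forms as a runnable sketch. The function names are hypothetical, and the `(; x, y) = ...` form assumes Julia 1.7 or later.

```julia
function sum_fields(point)
    (; x, y) = point       # assignment by property name: x and y become locals
    return x + y
end

function sum_pair((a, b))  # destructuring in the argument list introduces a and b
    return a + b
end

fields_total = sum_fields((x = 1, y = 2))
pair_total = sum_pair((1, 2))
```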

By declaration

global x will declare the variable in the enclosing module at the toplevel, no matter where you use it. And you can use it everywhere.

local x will declare the variable to have scope in the enclosing block. It is rarely used for function and friends, because that is the default for hard local scope. But you can use it to override the behavior above for soft local scope. You cannot use it in struct, macro, module, baremodule, but you can use it for if and begin at the toplevel.

Note that either of these constructs followed by assignment (eg global x = 3) is a shorthand for declaring a variable with the given scope, then assigning to that variable.
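
A sketch of both declarations with made-up names, assuming `flag_demo` is not defined beforehand:

```julia
function declare_demo()
    total = 0
    for i in 1:3
        local total = i       # a separate variable, local to the loop body
    end
    global flag_demo = true   # declares and assigns a global in the enclosing module
    return total              # the function's own total was never touched
end

returned_total = declare_demo()
```

After the call, `flag_demo` is visible at the toplevel even though it was only ever assigned inside the function.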

Wouldn’t this be the outermost?

Great draft!
Here are some (hopefully helpful) comments:

I believe this is wrong or at least worded in a way that reads wrong to me. Considering your description I would expect this function to print 1 but it prints 5

function test()
    x = 1
    map(1:5) do y
        x = y
    end
    println(x)
end

Without the surrounding function block you would be correct. The rules you describe here are valid only when the enclosing scope is a global scope.

Perhaps I didn’t understand what precisely you mean by “within a scoping block”.

EDIT: Thinking some more about this: Did you mean the argument names are always local to the function body, shadowing outer bindings even if those are local? That is correct indeed, i.e. this prints 1

function test()
    x = 1
    map(1:5) do x # changed this to shadow the function's x
        
    end
    println(x)
end

I’ll leave in the rest below anyways.

My mental model is the following (might be inaccurate):
Julia has 2 types of scopes: global and local. Global scope is basically synonymous with “toplevel”. Local scopes can be inside each other with a single global scope on the outermost level.
The difference in behavior between “soft” and “hard” local scopes for variable assignment depends solely on the enclosing scope the name is found in:

  • If the enclosing scope is local there is no difference. No new binding is created.
  • If the variable name is found at toplevel/global scope, then a “soft” scope reuses that binding, while a “hard” local scope will introduce a new binding.
    So this section about assignments might need some corrections/rephrasing for clarity.
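
The two bullets can be checked with a sketch like this one, assuming it runs at the toplevel of a file or the REPL (all names are made up for illustration):

```julia
total_demo = 100

function sum_to(n)
    acc = 0
    for i in 1:n
        acc += i         # acc is found in an enclosing local scope: no new binding
    end
    return acc
end

function shadow_global()
    total_demo = 0       # found only at global scope; a function body makes a new local
    return total_demo
end

summed = sum_to(3)
shadowed = shadow_global()
```

The global `total_demo` keeps its value of 100 after `shadow_global()` runs.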

Not an error but earlier

you talk about that the “variable belongs to […] the entire scope block”. This is true but feels very unintuitive to me (perhaps just because I did a lot of Python before Julia). So I would accompany this with an example that showcases it. We can reuse my example function from above:

function test_thisworks()
    map(1:5) do y
        x = y
    end
    println(x)
    x = 1 # assignment means x belongs to function scope
end

function test_thiserrors()
    map(1:5) do y
        x = y
    end
    println(x)
    #### x = 1 # no assignment, x is only local to closure
end

I applaud you tackling this and appreciate your attempt to categorize and group the behavior better. Now some critiques.


In general, I think you’ve brought the weedy details up earlier than indicated in your first post. For example, instead of the branching ifs of text here, I would rather the documentation just explain the simple, 80%-of-the-time case up front and then throw on some asterisks which link further down. The linked sections can then explain each exception and why it is necessary.

I am the author of the two admonitions currently at the top of the scope documentation which I added to answer my questions:

  1. Isn’t there a simple rule to explain all this if I don’t do anything weird?
  2. Why all this technical detail? Just tell me why my loop isn’t working.

To the last point, I’m not sure any of your current text explains the nuance required for while loops. You only really get an error if you try to access and declare. If you do only one or the other, it will run.

i = 0
while i<5
    # alone, runs and prints 0
    println(i)

    # alone, runs and prints 10
    i = 10
    println(i)

    # both together error.
end
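
For completeness, a sketch of one way to make such a toplevel loop behave the same in files and in the REPL: declare the variable `global` inside the loop. The name `loop_counter` is made up.

```julia
loop_counter = 0
while loop_counter < 5
    # Explicitly target the global, so there is no ambiguity about
    # whether the assignment creates a new local.
    global loop_counter += 1
end
```

Wrapping the whole loop in a function or a `let` block is the other common way to avoid the issue.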

I don’t think this is true because the below errors with nothing in global scope. It tries to re-use the outer (non-global) x.

let
    x::String = "hello"
    for i in 1:1
        x = 3
        println(x)
    end
end

I think it just needs to be clarified that soft-scope is active at the global level. So it only applies to a global for loop or whatever (and struct should probably be dealt with separately).

Within an already existing local scope, soft-scoped constructs like for, while, try, etc. introduce no new scope. It’s only at the global level that they do this sorta-half-not-really-locally-scoped behaviour.
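
One way to probe this claim is a sketch like the following (hypothetical names, and assuming no global called `tmp_in_loop` exists). A variable first assigned inside a `for` loop within a function is not visible after the loop, which suggests the loop body does have a scope of its own for newly introduced variables, even though assignments to existing enclosing locals reuse them.

```julia
function loop_introduces_scope()
    for i in 1:3
        tmp_in_loop = i            # first assignment happens inside the loop
    end
    return @isdefined(tmp_in_loop) # is the name visible out here?
end

visible_after_loop = loop_introduces_scope()   # false
```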

Is this actually how the implementation works? Or is this just a mental model you’re proposing?

It would be nice if 80% of common confusions about scope were addressed in the beginning of the chapter. But to me, this is secondary to accuracy and completeness. And I don’t know if we can achieve accuracy and completeness without better language.

@Tamas_Papp, you’ve written:

If you assign (x = …) within a scoping block… For function…a new variable is introduced.

One way to “unit test” the manual is to find counterexamples. So consider this counterexample to the above statement:

function f()
	x = 0
	x = 1
	return nothing
end

The x = 1 in the example does not create a new variable, it assigns to an existing one.

This ambiguity illustrates why I think we should distinguish between declaration and assignment from the very beginning. When you say variables get assigned “directly” (f(x) = ...), there is still declaration happening there, is there not? When you say x = ... is assignment, there might be declaration happening there too, as is the case with x = 0 in the example above. And you haven’t mentioned let blocks in your “By assignment” section. Those can get tricky.

I’ve said this already, but I similarly think that we should not mention hard vs soft scope, at least not till the end of the chapter. The distinction is just a matter of the default behavior of the REPL. While I agree that many people probably run into this early on, emphasizing the distinction creates an unnecessary cognitive load.

If I see a piece of Julia code, I want to be able to pull up the manual next to it, and make sense of the code. Maybe that means the best way to explain variable scope is through a flowchart. And I might find it annoying to go through 15 branches of yes/no to understand a variable’s scope, but I would live with it if it gave me the right answer.

No, going from inside out, it is the first module (ie the narrowest submodule). Eg

julia> module Foo1
       x = 1
       module Foo2
       x = 2
       f() = x
       end
       end
Main.Foo1

julia> Foo1.Foo2.f()
2

Yes.