Why is global scope different?

I just read through the Scope of Variables manual page and added a summary to the top of it. The main thing I still don’t understand is why global scope needs to be treated so differently from local scopes.

The rationale for the design decision is explained in this section with the following example.

We found that code like the following often occurs in the wild:

x = 123

# much later
# maybe in a different file

for i = 1:10
    x = "hello"
    println(x)
end

# much later
# maybe in yet another file
# or maybe back in the first one where `x = 123`

y = x + 234

But as long as global scopes are wrapped in a module, I don’t get how the above issue is any more likely to occur in global scope than it is to occur in the local scope of some main() function.
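
For comparison, here is a minimal sketch of the docs’ pattern moved into a function body. The function name `main` and the captured `result` are illustrative assumptions, not code from the manual. Because the loop is a nested local scope, the assignment reuses the outer local `x` without any warning, so the final addition fails:

```julia
# Hypothetical `main`, for illustration only.
function main()
    x = 123

    for i in 1:2
        x = "hello"   # no new local here: this reassigns the outer local `x`
    end

    return x + 234    # `x` is now a String, so `+` has no matching method
end

# Capture the error instead of letting it terminate the script.
result = try
    main()
catch err
    err
end

println(result isa MethodError)
```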

The scoping rules would certainly be a lot easier to explain if there were no distinction between global and local. I get that global variables would run slower, but that doesn’t seem like a good reason to error out. (Also we have typed globals now.)

Maybe the reason I don’t understand this decision is because I don’t understand how global variables are actually used. I have been told to avoid using globals my entire Julia journey, and the hard/soft rules seem like a lot of fuss and confusion for features that we are supposed to avoid using.

What is the common scenario (that occurs with global variables but not local variables) I am missing that justifies the existence of the distinction?

I’m not trying to reopen the whole can of worms here. I know it was discussed and refined at length.

I’m just saying my read-through of the documentation didn’t show me why interactive soft scope everywhere all the time would be bad. I’m hoping someone here can show me a good counterexample:

  1. that is likely to occur,
  2. that is worse somehow in global scope than in an outer local scope.

Then maybe I can clarify this point in the docs.

This is how I convinced myself: Scope of loops · JuliaNotes.jl

Sounds like your take is that it is just about disallowing low performance code in scripts. But then why are notebooks allowed to run slow code without complaint??

My understanding is more of “interactive” vs. “non-interactive” runs. In interactive runs we don’t want to be bothered by these scoping problems. In non-interactive runs, where things can be thought to take a long time to run, it is better to warn users of these possible issues.
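
To make the interactive vs. non-interactive difference concrete, here is a rough sketch that evaluates the same top-level code both ways. It assumes Julia 1.5 or later and uses `REPL.softscope`, which, as far as I know, is the expression transform that interactive front ends pass to `include_string`. The module names `FileLike` and `ReplLike` and the variable names are made up for the example.

```julia
using REPL  # for REPL.softscope (assumed to be the transform used interactively)

# The same top-level code, evaluated two ways below.
code = """
x = 0
for i in 1:3
    x = i    # a new local, or the existing global x?
end
x
"""

# File-like evaluation: the loop's `x` becomes a new local (Julia also prints
# an ambiguity warning to stderr), so the global `x` is left untouched.
file_result = include_string(Module(:FileLike), code)

# REPL-like evaluation: soft scope lets the loop assign the existing global.
repl_result = include_string(REPL.softscope, Module(:ReplLike), code)

println((file_result, repl_result))
```

Each call gets a fresh module, so neither run can see the other’s `x`.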

It could print a warning that the structure is suboptimal without halting execution.

Structs and functions with their method tables are global variables. The global variables determine how methods can be compiled and types inferred. Eval, for instance, must always run in the global scope of the module.

I believe (and I am not sure!) that if any function could modify the global namespace without the compiler being able to isolate that, then the compiler would be under enormous stress to revalidate everything all the time. Limiting global access seems to be one of the ingredients that makes the magic of Julia possible.

Global scope is special because the variables are always accessible — from anywhere and at any time. And in an interactive context, they don’t even have an “end”. Soft scope is problematic because the meaning of the code itself may change drastically based upon the existence of a global. For example, does this code throw an error?

for i in 1:4
    y = -i
    @show sqrt(y)
end

With soft scope, the answer is a great big ¯\_(ツ)_/¯. But it’s just so darn useful interactively. So that’s the compromise.

Here's that wat, interactively:
julia> for i in 1:4
           y = -i
           @show sqrt(y)
       end
ERROR: DomainError with -1.0:
sqrt was called with a negative real argument but will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
# ...

julia> y::Complex{Float64} = 0
0

julia> for i in 1:4
           y = -i
           @show sqrt(y)
       end
sqrt(y) = 0.0 + 1.0im
sqrt(y) = 0.0 + 1.4142135623730951im
sqrt(y) = 0.0 + 1.7320508075688772im
sqrt(y) = 0.0 + 2.0im

You can even invert the problem:

julia> for i in 1:4
           z = -i + 0.0im
           @show sqrt(z)
       end
sqrt(z) = 0.0 + 1.0im
sqrt(z) = 0.0 + 1.4142135623730951im
sqrt(z) = 0.0 + 1.7320508075688772im
sqrt(z) = 0.0 + 2.0im

julia> z::Int = 0
0

julia> for i in 1:4
           z = -i + 0.0im
           @show sqrt(z)
       end
ERROR: DomainError with -1.0:
sqrt was called with a negative real argument but will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
# ...

So, yes, this affects performance. But more importantly, it affects the meaning of the code you write. And that’s really the entire game.
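
One way to take the ambiguity out of a top-level loop is to annotate the assignment with `local` or `global`. The sketch below is only an illustration: the module name `ExplicitScopeDemo` and the two result arrays are hypothetical, and the typed global assumes Julia 1.8 or later.

```julia
# Hypothetical module and variable names, for illustration only.
module ExplicitScopeDemo

y::Complex{Float64} = 0          # typed global (assumes Julia 1.8+)

local_results = Int[]
for i in 1:2
    local y = -i                 # always a fresh local Int; the global is untouched
    push!(local_results, y)
end

global_results = ComplexF64[]
for i in 1:2
    global y = -i                # always the typed global; -i is converted on assignment
    push!(global_results, sqrt(y))
end

end # module

println(ExplicitScopeDemo.local_results)
println(ExplicitScopeDemo.global_results)
```

With the annotation in place, each loop means the same thing whether or not a global `y` exists and whether it runs in a file or at the REPL.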

But we accept that behavior in nested local scopes. Global variables are contained within MyModule or Main and local variables are contained within myfunction or for. I don’t see what makes it more dangerous in a global context.

Arguably a persistent REPL is the most likely place for this kind of error to occur, but that is also precisely where it is allowed for convenience. What makes a global script the most dangerous location for this kind of error such that redefinition has been disallowed there?

What makes it different is that there’s a very concrete and well-defined “lifetime” for all local variables. Their scopes have a literal end. I can’t reach into some already-defined function and fiddle about with (or even see!) its local variables. They might not even exist in the generated code! Both the compiler and I can look through the entirety of the local scope to see what the definitions of the variables are, and we can both be confident in our answers, at least with regard to their types, anyhow. Julia failing to understand how local variables are re-assigned and emitting overly pessimistic code is at the root of one of the most infamous performance issues these days, so infamous that I have its issue number memorized: Julia#15276.

Conversely, I can even add new globals to Base itself at any time with an @eval Base x = nothing. Yeah, that’s obviously a bad idea, but any innocuous function can create a global the first time it is run (not when it’s defined, mind you) just by having a global x definition in it.
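
A small sketch of that last point, using a hypothetical module `LateGlobalDemo` and function `innocuous` (neither comes from the thread): the global only receives a value when the function is first called, not when it is defined.

```julia
# Hypothetical names, for illustration only.
module LateGlobalDemo

# Defining this function does not assign `counter`; calling it does.
function innocuous()
    global counter = 1
    return nothing
end

end # module

before = isdefined(LateGlobalDemo, :counter)   # checked before the first call
LateGlobalDemo.innocuous()
after = isdefined(LateGlobalDemo, :counter)    # checked after the first call

println((before, after))
```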

Unlike a local scope, a global scope can be split across different files, which makes it a lot harder to tell at first glance whether a variable assigned inside a block is a new local or, under soft-scope behavior, an existing global. Nobody wants to write a bunch of top-level loops and try-catches in a new file (like the for-loop part of the docs’ example) and then realize that an earlier-included file, possibly written by someone else (like the x = 123 part of the docs’ example), already made a bunch of global variables with the same names. The file with the global names could also be the new one, in which case the problem reverses: nobody wants to discover that a file full of loops and try-catches that used to assign locals now reassigns the new globals. This might be the worse scenario, because it’s much harder for an interactive session to find variables in local scopes than global variables in modules.

It’s not reasonable to frequently search an arbitrary number of files to compute a list of allowed names in a module, so hard scope behavior exists to reasonably isolate local scopes from the global scope. Granted, it doesn’t protect you from all name sharing; you can still accidentally reassign global variables in the global scope. It’s just rarer to make even one new global than a bunch of new locals in a block.

In the contexts where soft scope deviates from hard scope behavior, you have 1) a running session where you can immediately check for a global variable, and 2) code that is usually contained in one place, like the REPL history or a notebook (though you can include source files with global variables and still run into unexpected behavior). While that helps, I personally do not like the soft scope because it complicates the scope rules and makes moving code to .jl files annoying. However, people really wanted to be able to paste local scope code from files to somewhere piece by piece and look at the variable state throughout; you can’t replicate that by pasting into a let block in the REPL. Maybe if there were a solid interpreter/debugger mode for arbitrary local scopes, the soft scope wouldn’t have been created to fill Main with persistent globals, but it’s here to stay in v1. From v1.9 onwards, making a throwaway module and setting it as the REPL’s context helps a lot; I can almost pretend I’m stepping through a let block.
