Another possible solution to the global scope debacle


#1

The infamous issue https://github.com/JuliaLang/julia/issues/28789 has been on most of our minds for a while now. Although I’ve stopped posting there, I can tell you I’ve been thinking about it a lot, as have Stefan and Jameson and others, and we have continued discussing it off and on. We should really try to get it right.

Last time around, we more-or-less settled on changing only the REPL, which has the main advantage of being non-breaking for most code. But after sleeping on it, it does not sit well to have different behavior in the REPL and files. So I’m coming around to another possible solution, which falls out of assuming the following two properties are mandatory:

  1. Updating global variables in loops “just works”.
  2. The scope of variables in top-level expressions can be determined statelessly; i.e. it doesn’t depend on what has been evaluated before.

From discussions with Stefan and Jameson, the simplest solution to these constraints seems to be just making everything outside of a function global, unless it is explicitly declared with local or let (or a for loop iteration variable, since for works like "repeated let"). Another way to state the rule is that assignments can only implicitly declare local variables inside functions.

This is similar to what we had pre-0.7, but simpler since it does not depend on which global variables have been assigned already. The downside, of course, is that it makes more variables global “by default”, which can be particularly painful in long test suites, but the simplicity and possibility of local reasoning mostly make up for that in my mind.

Of course, this would be a breaking change, but we can separate what the “right thing” is from how to get there. One piece of good news is that since the 1.0 behavior has the stateless property, it should be straightforward to write a femtocleaner rule that inserts needed local declarations to retain 1.0 behavior where needed.

Thoughts?


#2

I don’t think we can completely separate them, since fixing this is fairly urgent and Julia 2.0 is probably several years away at this point.

Is the short-term plan still to change this only in interactive contexts (REPL, IJulia, …) in Julia 1.1?


#3

That’s a good point; if we like this design we could indeed enable it only in the REPL first, and roll it out for files later. The key point is that going back to the exact 0.6 behavior is not ideal, since that works fine in the REPL but very badly in files.


#4

It may be necessary to make an exception to the breaking changes rule for this one thing since the issue keeps coming up very often. There are a couple of approaches to doing that. One possibility is to do the following:

  1. Julia 1.1: unannotated assignment still introduces a new local but produces a deprecation warning; use local or global to silence the warning.
  2. Julia 1.2: unannotated assignment is a syntax error, indicating that it will assign to a global variable in the next version of Julia; use local or global to make code work as desired.
  3. Julia 1.3: unannotated assignment assigns to a global; use local to get a local variable.

This is more steps than we would have taken to make such a change in the past, but since we’re not supposed to be breaking things at all, it seems better to be conservative.


#5

Note that this change can be FemtoCleaned very easily since it’s purely syntactic, but we’re still worried about end-user code, which would be affected by this.


#6

I absolutely love the simplicity of this idea. No need to explain scope in an introductory level course for non-CS students, and when it does come up you can explain scoping behavior without even using the word. Perfect.

But I’d prefer to rip off the band aid and have it fully implemented in 1.1 or 1.2 at the latest. I understand the desire to be conservative given the promise of no breaking changes, but you can make the argument that bugs should be fixed ASAP even if the fix breaks stuff. And even if this behavior was 100% deliberate, many Julia newbies (and their teachers) experience it as a bug. The fact that code can be autoupdated reduces the need for conservatism in this case IMO.


#7

To clarify (sorry if this is obvious) - if I’m inside a module and declare a variable outside of a function, does that then become a global variable if someone does using MyModule? Or is it “global” only with respect to the code in MyModule?


#8

I’m afraid I don’t think that’s an option given the commitment we’ve made to not breaking user code in Julia 1.x; even deprecations are really stretching it. We could, however, make a hard switch in interactive contexts like the REPL and IJulia, since we were planning on making a hard switch there anyway by silently reintroducing the old soft scope behavior. In scripts and modules, users of 1.1 and 1.2 would, however, need to explicitly annotate assignments in top-level scopes with either global or local.


#9

This would not affect using or what is exported.

Put differently, this does not change the behavior of global variables, it just changes which variables are considered global as opposed to local to some block.


#10

I’m a bit unclear on what you’re asking but it does not sound related. The only thing this affects is assignments to variables in scope-introducing constructs outside of function bodies. For example:

# fresh REPL session

julia> for _ = 1:1
           t = "something"
       end

julia> t # what is the value of t here?

julia> t = 0;

julia> for _ = 1:1
           t = "something"
       end

julia> t # what is the value of t here?

It has nothing to do with using or modules.


#11

Knowing that:

function f()
    for i in 1:10
        if i == 1
            t = 10
        else
            t += 10
        end
    end
end

would error on calling f(), what would be the behavior of:

for i in 1:10
    if i == 1
        t = 10
    else
        t += 10
    end
end

if it is run in global scope and t was not assigned any global value before running this loop?


#12

Those would behave differently: in a function body, there would be an undefined variable error; in top-level scope, there would be no error. That’s a conscious tradeoff to make the meaning of the code statically resolvable.
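For reference, the function-body half of this can be checked on Julia 1.0 as it stands. The following is only a minimal sketch: the try/catch wrapper and the name err are added for illustration, and the top-level half is left out because its outcome depends on which rule is in effect.

```julia
# The function from the question above, unchanged.
function f()
    for i in 1:10
        if i == 1
            t = 10      # t is a new local of the loop body on each iteration
        else
            t += 10     # on iteration 2 this fresh t has no value yet
        end
    end
end

# Capture the exception instead of letting it abort the script.
# `err` is a name introduced here only for the demonstration.
err = try
    f()
    nothing
catch e
    e
end

println(err isa UndefVarError)
```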


#13

Maybe the answer you’re looking for is: Each module has its own global scope. There is no global-global scope. If you want to make a global variable of a module available with using then you need to export it.
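A small sketch of what that means in practice, run at top level. The module names A and B and the variables x and y are made up for the example.

```julia
module A
export x
x = 1          # global of module A, exported
y = 10         # global of module A, not exported
end

module B
x = 2          # a different global that belongs to module B
end

using .A       # makes the exported name x available here

println(x)     # refers to A.x
println(A.y)   # non-exported globals need the module prefix
println(B.x)   # B's x is a separate variable
```

None of this depends on the scoping rule under discussion; it only shows that "global" always means "global to one module".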


#14

The other difference (if I should move further questions to the GitHub issue, please let me know) is what would be the behavior of:

v = []
for i in 1:10
    g() = i
    push!(v, g)
end
v

I guess g would be defined in global scope and calling g() would return 10, but calling v[i]() also would return 10 for any valid i.

However, if you wrapped this code in a function v[i]() would return i.
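The wrapped-in-a-function case can be tried directly on Julia 1.0. This is a sketch, with collect_closures as a made-up name for the wrapper:

```julia
# Hypothetical wrapper; the body is the loop from above.
function collect_closures()
    v = []
    for i in 1:10
        g() = i        # inside a function, g is a new local closure each iteration
        push!(v, g)
    end
    return v
end

v = collect_closures()
println(v[1]())
println(v[10]())
```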

Is there a list of such differences somewhere?


#15

Although it could be done either way, I would propose that for is like let, and for i constitutes an explicit declaration of a local, so loop iteration variables would continue to be local.

If we settled on a final design we could write up a full list of differences, but I would really encourage understanding it based on a minimal explanation rather than a random-looking list of examples. The rule is that some variables are explicitly declared (global, local, for, let) and in those cases it’s obvious. But for variables not explicitly declared, we default to global outside a function and local inside a function.
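As a rough illustration of the "explicitly declared" half of the rule, here is a sketch that uses only explicit annotations at top level, so it should mean the same thing under the 1.0 rules and under the proposal. The names total, tmp and sum_to are invented for the example.

```julia
total = 0
for i in 1:3
    global total += i    # explicitly global: updates the top-level variable
end

for i in 1:3
    local tmp = i * 2    # explicitly local: not visible after the loop
end

function sum_to(n)
    acc = 0              # undeclared assignment inside a function: local
    for i in 1:n
        acc += i         # updates the enclosing function's local
    end
    return acc
end

println(total)
println(sum_to(4))
```

Only the undeclared top-level case (an assignment inside a loop with neither global nor local) would change meaning between the two rules, which is why it is left out here.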


#16

I’m not sure myself whether the following is a good idea, but it is another option that should at least be contemplated: decide to release Julia 2.0 very soon, instead of Julia 1.1, and put a proper fix in. That would mean you technically honored semantic versioning and the “we don’t break things in 1.x” promise, while still putting this behind us as quickly as possible. Yes, it would be a deviation from the original plan, but sometimes annoying things happen, and at least from my point of view this would be ok. Maybe one useful exercise would be to try to think of user types/groups/categories that would be really harmed by this strategy. I have a hard time coming up with one, but I also don’t really have a good overview of the Julia user base, so this might well just be an insane idea :slight_smile: But at least for my requirements, I would prefer that over some complicated, multi-version change story. One reason is that we are really just in the middle of transitioning to Julia 1.0, and as far as I can tell, this wouldn’t be very disruptive to us at all.


#17

I tried :smile: not to be random. It is not related to rebinding of i but to how the global keyword changes the behavior of the loop in top-level scope, because with global the function g gets added to a global method table:

julia> v = []
0-element Array{Any,1}

julia> for i in 1:2
           g() = i
           push!(v, g)
       end

julia> for i in 3:4
           global g() = i
           push!(v, g)
       end

julia> v
4-element Array{Any,1}:
 getfield(Main, Symbol("#g#3")){Int64}(1)
 getfield(Main, Symbol("#g#3")){Int64}(2)
 g
 g

julia> g()
4

julia> v[1]()
1

julia> v[2]()
2

julia> v[3]()
4

julia> v[4]()
4

Or am I missing something?


#18

I think it’s fairly obvious that what you’re proposing would be far more damaging to the perception of the language than having a single, well communicated, carefully executed exception to the “no breaking changes” semantic versioning promise. The ultimate purpose of semantic versioning is communication between developers and users; as long as we communicate this change well enough through other channels it should be fine.


#19

Agreed. Another way to look at it is: having both Julia 1.x and 2.x out there in the world has a cost. People need to spend more time dealing with the existence of incompatible versions. (I don’t want to dwell on it, but of course the Python 2/3 split is the canonical example.) To pay that cost, there has to be some balancing benefit: Julia 2.0 has to be significantly better than 1.0 to make it worthwhile. While this scope change is highly desired by many people, I don’t think it meets that bar.


#20

Another thing I think should be done is to leave enough time between the 1.x versions to allow most people to notice the change, even those who do not regularly read release notes, documentation, or this forum/Slack/etc., but nonetheless update their Julia version fairly often. I think the .x releases were on a quarterly basis, if I recall that thread by you about future versions correctly?