Another possible solution to the global scope debacle

Would it be possible to just re-introduce the warning/error messages in the REPL but otherwise keep the new behaviour from #19324?

In this way, the examples (IanNZ / 28789) lose their gravity, since the error/warning message just tells me what to do, i.e. prefix the assignment with global. The silent failure gives a warning, the ‘real’ failure an error, for example:

julia> for i in 1:2
         beforefor = false
       end
┌ Warning: `implicit assignment to global variable `beforefor``.
│ Use `global beforefor` instead.
└

julia> for file in list_of_files
         # fake read file
         lines_in_file = 5
         total_lines += lines_in_file
       end
┌ Error: `implicit assignment to global variable `total_lines``.
│ Use `global total_lines` instead.
└

I’m not sure I like (yet?) the need to write global in some places, but on the other hand it seems a coherent rule for accessing global variables, and losing the hard/soft scope distinction was nice (this distinction has also been criticized). But more importantly, I think the current behaviour has been chosen, and I hope there won’t now be a rush to change it again (at least not before 2.0, and only after careful consideration)!

8 Likes

Just wanted to add my support for @jeff.bezanson’s scoping rule proposed here. I also think it’s reasonable to make an exception to the promise not to introduce breaking changes and roll this change out in some version of 1.x.

2 Likes

I was asking whether, in a common case where people would be operating on some variables in a loop at the REPL, e.g.

B = loadsomedata("data")
s = 0
for i = 1:length(B)
    s += somefunction(B[i])
end

that loop would become slow compared to 0.6 and 0.7, given that more variables will be treated as globals.

I have been thinking about this rule since I read the post this morning, and I think it would be a nice solution that handles the various corner cases in a way that

  1. is easy to reason about statically (just from looking at the code),
  2. does not depend on the variable being defined,
  3. is a good approximation of what people like to see intuitively.

It would be great if someone could prepare a table of various relevant mini-examples, and discuss

  1. what 0.6 did (optionally, but it would be helpful),
  2. what happens in 1.0,
  3. what the rule proposes.

Eventually some of these could end up in the documentation.

E.g., to continue #28789 (because it is locked), my understanding is that under the new rule,

julia> for i in 1:2
         beforefor = false
       end

julia> beforefor
false

regardless of whether beforefor was assigned before;

and scope inside functions would just work as in 1.0. Is this correct?

(Also, now that a proposal is in sight and people have calmed down, could you please unlock the issue?)

7 Likes

I don’t have a super strong opinion either way on what the right forward course is for this problem (part of me really likes the current behavior), but the suggestion here definitely seems reasonable, so I’m not opposed to it.

That said, I’m strongly against making any breaking change and continuing to call the language Julia 1.x rather than Julia 2.x. SemVer really is just supposed to be a way of communicating with users, and if the 1.x series of some software has a misfeature that is noticed as quickly as this global scope debacle has been, it’s a very reasonable approach to release a 2.0 version relatively quickly. That’s why we have semver in the first place: to facilitate clean communication of things like this. If we continue to call software with a breaking feature change by the same major version, we lose almost all the benefit of semver. Basically, we have to release a new version of Julia to fix this no matter what; we should call the new breaking Julia version by its proper semver name: 2.0.

If a quick 2.0 is really completely off the table, I think @swissr’s suggestion is the most reasonable way forward (I’m definitely against introducing differences between how the REPL behaves and how scripts behave).

11 Likes

Just to be clear, IJulia already does a variant of this (automatically inserting global keywords as needed in global scope). So far there has not been a single complaint that this is different from scripts/files. Possibly this is because once you get to the point of putting code into a file you are usually writing functions.

So I don’t think it would be a big problem if the new behavior were in the REPL in 1.1 and opt-in elsewhere. Certainly we would get fewer confused queries than now, especially if the error message for undefined variables in global scopes is improved also.

5 Likes

In that code in the REPL, B and s would be global in all versions of Julia. The only small difference is in 1.0, which gives an error, and you have to write global s += somefunction(B[i]). Making s local to the loop would not work, since then you couldn’t update the global s you want to use outside the loop.

I see, thanks. I thought 0.6 might have been using a trick to “make the globals local” in order to act on them in the loop efficiently.

If you have the time then, would you mind explaining how the proposed scheme is different from 0.6?

In 0.6:

julia> x = 0;

julia> for i = 1:2
         x = i
       end

julia> x
2

julia> for i = 1:2
         y = i
       end

julia> y
ERROR: UndefVarError: y not defined

Here y was local to the loop since no global y had been assigned yet. In the proposal in this thread, y in the loop would also be global.
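To make the difference concrete without relying on any particular version's implicit rule, both outcomes can be spelled out with explicit annotations. This is only a sketch: the helper name y_defined_after_local is made up for illustration, and it assumes a fresh session in which no global y exists yet.

```julia
# 0.6-style outcome when no global y exists yet: y stays local to the loop.
for i = 1:2
    local y = i
end

# Record whether a global y exists at this point (hypothetical helper name).
y_defined_after_local = isdefined(@__MODULE__, :y)

# Outcome under the proposal in this thread: the loop assigns a global y.
for i = 1:2
    global y = i
end
```

With the explicit keywords the code means the same thing in 0.6, 1.0, and under the proposal; the discussion is only about which of the two an unannotated y = i should default to.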

4 Likes

I have a suggestion: would it be possible to refine the terminology? The way I understand the issue is that there are variables that come into local scopes from enclosing scopes. They are not necessarily global, they are just not local to the interior scope. If I’m right, it might be worthwhile not to refer to such variables as “global”, but something else (outer?).

Yes, if you’re just talking about referencing a variable from an enclosing scope, it’s best to call it an outer variable. However, this issue truly only affects global variables. Nothing has changed or will change about how accessing variables in outer scopes works if the variables are local.
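As a small illustration of the local case (the function name outer_demo is invented here), a loop inside a function assigns to the enclosing function's local variable with no annotation at all:

```julia
# x is a local of outer_demo, i.e. an "outer" variable from the loop's point of view.
function outer_demo()
    x = 0
    for i = 1:2
        x = i       # assigns the enclosing function's local x, no keyword needed
    end
    return x
end

result = outer_demo()
```

Here result is 2, since the loop body sees and updates the function's x.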

I’m hesitant to express an opinion that is clearly contrary to the mainstream here, but I really like the way 1.0 works with regard to scope. It has helped me develop a sense of code isolation, to the point of looking at

julia> x = 0;

julia> for i = 1:2
         x = i
       end

julia> x

and immediately expecting to get 0. I want isolation as much as possible, and a clear notion that global bindings are not something to use as work registers, counters, etc. Globals are to be feared, in a way, and to be left untouched once they are defined. I somehow like this notion, and it has served me well so far. This is just a matter of education, not design, I think. So, for what it’s worth, I cast my vote to actually do nothing about this issue.

22 Likes

For anything serious, wouldn’t you be putting everything in a function anyway?
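For reference, a sketch of what that looks like for the summation example from earlier in the thread. The function name accumulate_all is invented, and somefunction and the data are stand-ins, since the originals were never defined in this discussion:

```julia
# Stand-in for the thread's `somefunction`; purely illustrative.
somefunction(x) = 2x

# Inside a function, s is an ordinary local, so no `global` is needed.
function accumulate_all(B)
    s = 0
    for i in eachindex(B)
        s += somefunction(B[i])
    end
    return s
end

B = [1, 2, 3]             # stand-in for loadsomedata("data")
s = accumulate_all(B)
```

The only top-level assignments left are B and s themselves, which are plain global definitions rather than assignments inside a loop.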

Oops, sorry. I thought the issue was to unify treatment of global variables and variables in outer scopes relative to inner scopes.

I agree with this, but then

Julia 2.0 is probably several years away at this point.

suggests that our hands are effectively tied by some general sense that we can’t release new versions before some designated time has passed, which puts us at an extreme disadvantage of our own making.

We should adopt one, but not both, of these guidelines.

(Edited to add: please feel free to split this discussion out of this main thread if it’s distracting, but I’m really interested in this issue since we’re facing it in LightGraphs as well. SemVer isn’t time-based, it’s feature/contract-based, and there’s this strong tendency to conflate the two properties.)

7 Likes

What you want to do here is clearly:

B = loadsomedata("data")
s = 0
for i = 1:length(B)
    global s += somefunction(B[i])
end

In this example, if s is local then the code doesn’t work, so that’s not what you intend (there are other situations where it would make sense for s to be local, however). Regardless of how you write that, it has the same performance, because it’s doing the exact same thing. The only question is whether you need to write the global keyword or not (and what the rules are about when you do or don’t need to write global or local in general). There’s no way that different syntax can make the same code faster or slower. @Liso brought that up as a possibility, but it’s not a possibility: this discussion has no relevance to performance.

Violating SemVer is bad; it breaks the SemVer promises about compat.

I think anyone who follows closely knew there would be problems that couldn’t be fixed without breaking changes.
And part of maturing into a 1.0 version is becoming comfortable that we can no longer just fix them whenever we want.
That time is behind us now.
And that is OK.

I am much more comfortable with the idea of whatever behaviour we choose being on by default in the REPL,
and opt-in everywhere else:
using Future: scoping
Then in 2.0 we can have the nice solution.

The REPL already behaves a little differently, since it implicitly does using InteractiveUtils as well as Base and Core.
That actually catches me out surprisingly often.

5 Likes

To understand the issues a little bit better: when we say global, do we mean variables defined at the top level (REPL), or variables defined outside of functions?

In a way, it seems to me that variables defined outside of functions are really of the same kind, no matter whether they are defined in the REPL or in files (modules). These variables are always defined in modules. (The top-level variables are defined in Main, aren’t they?)

1 Like

As a matter of pure syntax, this is the most automatically fixable thing possible. We can completely automatically fix this with 100% accuracy in every registered Julia package. Just rewrite every instance of top-level scope code that depends on the 1.0 behavior to have an explicit local annotation. People seem to mostly agree that changing this in the REPL and IJulia would be fine (in fact, it’s already changed in IJulia via SoftGlobalScope.jl). That leaves only the case of non-interactive scripts that have been written since 1.0 was released, which strikes me, as I said, as a small risk, especially if we do a 1.1 warning followed by a 1.2 error, followed by actually changing the behavior in 1.3. Even for that small sliver of cases, this change would leave most scripts still working correctly, but with a few more global bindings than intended. You have to use a name as a global, then use it as a local in a top-level non-function scope construct (intentionally; some cases will be accidental, in which case this will fix a bug), and then access the global again, expecting it to have the value it had before the middle usage. Changing in the other direction, which is what the 0.6 to 1.0 change did, was more likely to break things.
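A sketch of the pattern in question, after the kind of rewrite described above (this is a hand-written illustration, not the output of any existing tool):

```julia
x = 0                 # 1. the name is used as a global

for i = 1:2
    local x = i       # 2. then as a loop-local; the added `local` pins the 1.0 meaning
end

x_after = x           # 3. the global is read again and is expected to be unchanged
```

With the annotation in place, x_after is 0 whichever default rule is in force, which is why the rewrite can be applied mechanically.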

Variables defined outside functions are all global, whether in the REPL or not. We’re just considering using different syntax in the REPL for greater interactive convenience without breaking larger programs.
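A minimal sketch of that point, using an invented module Demo: a variable assigned at a module's top level is a global of that module, exactly as a variable typed at the REPL is a global of Main.

```julia
module Demo

counter = 0                  # global of module Demo (defined outside any function)

function bump()
    global counter += 1      # assigning a module global from a function needs `global`
    return counter
end

end # module Demo

Demo.bump()
current = Demo.counter       # read the module's global from outside
```

After the single call to bump, current is 1.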