New scope solution

Have we ever considered the option of supporting both 0.6 and 1.0 scoping behavior via a command line argument?

It looks like C++ compilers took a similar path, as explained in the Proposal: Go 2 transition:

An example of a breaking change in C++ was the change in the scope of a variable declared in the initialization statement of a for loop. In the pre-standard versions of C++, the scope of the variable extended to the end of the enclosing block, as though it were declared immediately before the for loop. During the development of the first C++ standard, C++98, this was changed so that the scope was only within the for loop itself. Compilers adapted by introducing options like -ffor-scope so that users could control the expected scope of the variable (for a period of time, when compiling with neither -ffor-scope nor -fno-for-scope, the GCC compiler used the old scope but warned about any code that relied on it).

1 Like

The main objection to the current behavior is for interactive use, and this makes that worse…and in fact it would make interactive shells effectively unusable compared to competing languages.

7 Likes

FWIW I don’t believe so. The snippet you quote is, I think, the form that nearly everybody expects to work. It’s a distraction if, e.g. in teaching, beginners must ‘just be told’ about global. In addition, code on Stack Overflow, blogs, and elsewhere that is found via Google will not use global because of the SoftHardScope hack. Thus you first have to learn that in Julia you need to write ‘global’ for your quoted loop, and then additionally learn that with ‘some hack’ you can leave it out again (or that there is a command line flag to modify the behaviour). That doesn’t give a good impression, imho.

For me it seems that the first (global) or especially this second (local) solution works great in general. I don’t think it would be that bad if, in some edge cases, the behavior is not totally obvious at first sight without looking at the rules or asking about the interpretation; a beginner most likely would not run across those cases.

If there is the potential for confusion (or if the code is too long) it might be good practice to annotate with local or global anyway. This is the same as with complicated conditions where using (optional) brackets does clarify what ‘belongs together’.

3 Likes

One of the problems with the 0.6 scoping rules was that they were too context dependent, which made scope tricky to reason about. I fear that the proposed solution, clever as it is, is a step back in that direction. I think it would be much better to simply make variables within for and while loops always have outer scope unless explicitly marked as local. To me the concern about accidentally littering the namespace with in-loop variables is not that compelling; a decent linter could help with that.

2 Likes

Is it possible to exclude let? It would be very useful for requesting a clean namespace and “plain” scoping rules. I know I can write

(function()
    x = 0
    for i in 1:10
        x += i
    end
    x
end)()

but it feels like I’m writing JavaScript.

1 Like

Fair enough. Now, I’m wondering if @jeff.bezanson’s solution from the other thread is still on the table at all, perhaps with the addition of local let as per @tim.holy? Looking at that thread, it seems that there were fewer objections, at least concrete ones with counterexamples.

Could we have a branch / branches with the proposed behavior(s) for people to play around with before a final decision is made?

2 Likes

Just to clarify a point that I feel bears repeating: when we talk about global variables, we’re not actually talking about “global” variables in the traditional sense. Rather, this is more like a discussion about “module”-scoped vs “function”-scoped vs “block”-scoped variables…not sure if this says anything about a potential solution, but I feel like keeping this facet of the issue in mind is important in arriving at the “right” solution.

2 Likes

Possibly, but AFAIK in the documentation “global scope” is used consistently, so I don’t think that this is a frequent source of misunderstanding.

I wasn’t meaning to suggest it as a point of confusion but rather as another thing to consider when contemplating solutions.

To take @tim.holy’s example of shadowing first with an iteration variable, while doing so “globally” would potentially be disastrous to other code in the same module, unless the module imported first, the damage would be limited to the module’s scope (as opposed to ruining the entire runtime, as would be the case if “global” was…global).

So, for example, if static analysis of a module determined that there were no uses of first, then shadowing first “globally” in a for loop could be done without consequence. I’m not suggesting that this is a solution, but maybe some creative thinking in this direction is warranted?

In other words, since “global” isn’t global, there is still the possibility of promoting things to global while still containing the side-effects of doing so.

1 Like

You can’t really do that analysis for the REPL. For example, quick quiz, if you have two statements, first = true and first([1,2,3]), what happens when you try them both? What if you swap the order in which you try them?

Answer (if you’re playing along, no peeking): whichever one you try first succeeds. Then the other one throws an error, but the error depends on the order you selected.

Consequence: since no amount of analysis can guess what the user will do next, one can argue it’s undesirable to assign more variables than necessary at the REPL. Damage can be repaired (e.g., first = Base.first), but having to do that manually is a pain. (“Drat, which package is foo defined in again?”) As a consequence I do a fair amount of playing around inside of let blocks, because I usually try to keep my REPL session going as long as is practical.

I’m not trying to say it would be fatal if we have to give this up, I’m just wanting to make sure the consequences of our choices are clear.

4 Likes

Precisely! Evaluation at the REPL occurs in a module scope, but it is, in effect, a never-ending module. Thus, certain whole-module analysis style solutions are unavailable to code at the REPL. One might reasonably ask, “Why must REPL evaluation occur within a module scope?” I’m not sure of the Julia-specific answer, but for every other language I’ve used with a REPL, the answer is usually something along the lines of: “Well, what other scope are you going to use?”

But maybe that’s the real problem…maybe we shouldn’t be trying to solve UX at the REPL with the same hammer as used to keep code within functions and modules clean. What if, for example, we introduced a script construct that had its own scoping semantics tailored toward REPL sessions and top-level script files?

Maybe within a script scope we track all variable shadowings so that, instead of having to do first = Base.first, we could just do a pop_binding(first) (similar to the old workspace() call…which, for the record, I still miss :confused:). Also, as a potential added benefit, adding a script scope for REPL sessions could probably be done in v1.X without having to sacrifice semver (though making it the default for top-level scripts might violate strict semver).

2 Likes

Is this a bug then?

julia> begin
       @goto l2
       @label l1
       @show :call
       first([1,2,3])
       @goto end_
       @label l2
       @show :assign
       first = true
       @goto l1
       @label end_
       end
:assign = :assign
ERROR: cannot assign variable Base.first from module Main

julia> begin
       #@goto l2
       @label l1
       @show :call
       first([1,2,3])
       #@goto end_
       @label l2
       @show :assign
       first = true
       #@goto l1
       @label end_
       end
:call = :call
:assign = :assign
ERROR: cannot assign variable Base.first from module Main

1 Like

Cute example. I don’t think I’d call that a bug, unless you’d say that the cannot assign variable Base.first from module Main is itself a bug. Given that this is something we can’t do, I think your example is validly dependent on details of the interpreter.

1 Like

I don’t have an opinion on how scopes should work, but I do believe that in order to accommodate new users and classroom use, it is important to have insightful error messages. For example, Matlab has an obscure scope rule stating that in breakpoint mode inside a function, you aren’t allowed to introduce a new name if the function itself has a nested function. See below for an example, and notice the detailed error message that links to a section of the documentation. So even if I had never heard about scopes and was new to Matlab, I would not be too put off by this occurrence.

>> type outer_function
function x = outer_function()
    function y = inner_function()
        y = 7;
    end
a = inner_function();
keyboard
x = a;
end
>> x = outer_function()
K>> z = 12;
Attempt to add "z" to a static workspace.
 See Variables in Nested and Anonymous Functions.
K>> dbcont
x =
     7

The discussion has gotten pretty off-topic, but I still like this proposal and should mention a property that I find really appealing about it. To understand whether an unannotated variable is local or global in a top-level scope, you just need to scan the block from top to bottom and find the first usage of it in that scope: if it’s a read (including updating operators like +=) then the variable is global, if it’s a write then the variable is local. This is because the rule requires all paths to match, so finding any first usage is sufficient, including the first one from the top. Requiring all paths to agree is just to avoid situations where generally valid code transformations would otherwise cause the code to change behavior. For example, negating a condition and swapping the if and else branches should not change the behavior of code. Requiring all paths to match ensures that it doesn’t.
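A rough sketch of that top-to-bottom scan, for anyone who wants to experiment with it. This is only an illustration and not how the parser or lowering would implement the rule: first_use and classify are hypothetical names, only a few updating operators are handled, short-form function definitions and nested scopes are ignored, and the all-paths-must-agree check is left out on the assumption that the code is already valid under the proposal.

```julia
# Hypothetical sketch: `first_use` and `classify` are illustrative names, not Julia APIs.

# Heads of a few updating operators, which read their target before writing it.
const UPDATING_HEADS = Symbol.(["+=", "-=", "*=", "/="])

# Return :read or :write for the first use of `name` in `ex`, or `nothing` if unused.
function first_use(ex, name::Symbol)
    ex === name && return :read           # a bare occurrence is a read
    ex isa Expr || return nothing         # literals, other symbols, line numbers
    if ex.head === :(=)
        lhs, rhs = ex.args
        r = first_use(rhs, name)          # the right-hand side is evaluated first
        r === nothing || return r
        lhs === name && return :write
        if lhs isa Expr
            # destructuring such as `(x, y) = ...` writes each listed name
            lhs.head === :tuple && name in lhs.args && return :write
            return first_use(lhs, name)   # e.g. `a[i] = v` reads `a` and `i`
        end
        return nothing
    elseif ex.head in UPDATING_HEADS && ex.args[1] === name
        return :read
    end
    for arg in ex.args
        r = first_use(arg, name)
        r === nothing || return r
    end
    return nothing
end

# Map the first use onto the outcome described above: read => global, write => local.
function classify(ex, name::Symbol)
    use = first_use(ex, name)
    use === :read && return "global"
    use === :write && return "local"
    return "unused"
end

loop_block = quote
    for i in 1:10
        total += i
    end
end

classify(loop_block, :total)  # first use is the read performed by `+=`
classify(loop_block, :i)      # first use is the write in the loop header
```

Because the scan never consults which globals currently exist, the answer depends only on the text of the block.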

I should also point out that it is breaking in a very specific and inoffensive way: code that would cause a parse error and require an explicit local or global annotation with this change, is almost certainly buggy. This is because the “otherwise” clause in the rule has to be one of these two cases:

  • the first use of x is a write on one path and a read on another;
  • x is only written.

In both cases x is currently local. In the first case, if the path where x is read first is taken, then there will be an undefined variable error. In the second case, there will be no error, but either the write is intended to modify a global x which it won’t actually do since x is local; or the assignment to x is left over from a time when there was a read somewhere in the scope that has since been deleted, in which case the write is vestigial and should be removed. Either way, it’s a bug and this change should reveal it. So it’s pretty hard to get too upset about the kind of “breaking” this change would cause—it would mostly result in fixed bugs.

14 Likes

This is important. I do not have a strong opinion on how scope needs to be done, but I do believe that the rule needs to be practical (I know how to check/use it in my code).

A clarifying question: suppose a user pastes (which is one of the use-cases frequently invoked when talking about this issue) the following code into the REPL:

for i in 1:n
    j += i
end

and tries to run it, only to realize that j and n are undefined, which results in an error because the user failed to paste the line

j = 0

and to initialize n (which was supposed to be a function argument).

Is the global/local property of j and n fixed at this time (as global) by the attempted evaluation, despite the error? Or does it remain unknown, and will be established when the user runs the code with these variables initialized?

3 Likes

I just hope that whatever the final solution is, it does not require looking past a statement/expression in order to figure out the meaning of a variable within that statement/expression.
One of the worst things IMO about pre-v1.0 Julia was the “spooky action at a distance”, where adding a statement pages further down could affect the meaning of a variable.
I am so happy that the scope rules were changed, and I’m thankful that Stefan & others are attempting to come up with a reasonable solution for the REPL, as long as that “spooky action” and weird corner cases are not reintroduced.

2 Likes

Nothing is affected by attempted evaluation. The whole point of moving away from the 0.6 design is that the meaning of code does not depend on the global state of which global variables are defined or “exist”. The code means something independent of whether j and n are defined; if it uses them as globals and they aren’t defined, that will be an error, of course.

8 Likes

May I ask what is the rough timeline for this change? I am asking because I want to give a Julia tutorial in my school and if this change won’t take too long, I will wait.