Another possible solution to the global scope debacle

I may misunderstand this, but I don’t think the tone was hostile at any point.

Please keep in mind that many people here teach or taught programming to non-CS audiences, on both sides of this issue.

I agree with your first two points. For the last one, it was suggested here that loading SoftGlobalScope.jl or similar by default would solve it. However, from the discussion it is not clear to me whether that would really solve everything, what pitfalls that approach might have, or how it would interact with semantic versioning. I will await the result of the discussion and decide what to do then.

Sorry, but I think this is not correct.

While the second let block returns the value 2 indeed,
asking for y still gives ERROR: UndefVarError: y not defined

First of all, thank you @jeff.bezanson (and the other core devs of course) for spending a lot of time and effort on this issue.

Concerning feedback on the proposed solutions, I like that:

  • It is very consistent and easy to explain. IIUC it says that: “things outside of functions are global unless marked as local explicitly”
  • It is IMO quite intuitive for users new to programming (as opposed to the naive but quite common “I have defined x here, why does it say it’s undefined?”)
  • It only special cases functions, which is nice as they are arguably the most fundamental Julia construct

Things that could be added (already mentioned above):

  • Making for outer i = ... work in top-level scope; it currently errors
  • local for ... and local let ... as in some cases the local default seems preferable (especially in “production code” or modules)

The second point is particularly relevant in a transition period as the warning would go from:

In the future in top-level for blocks variable assignment will default to global.
Annotate all assignment explicitly as local or global.

to

In the future in top-level for blocks variable assignment will default to global.
Use local for to retain the old behavior.

However, it seems to me that this change is a bit drastic (especially as the proposed option hasn’t been tested yet), and I have the impression that it is not necessarily an improvement in module code. For example, IIUC, with this proposal, if I add a for loop at top level in a module, all the variables that I use will become global variables of the module and accessible from outside, and I also risk overwriting relevant globals that I am exporting.
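
For comparison, a minimal sketch of what Julia 1.0 does with a top-level loop in a module. The names Demo and tmp are made up for illustration. Under the 1.0 rules the loop variables stay local, so nothing is added to the module; under the proposal discussed here, tmp would presumably become a global of Demo.

```julia
module Demo

# Top-level loop inside a module. Under the Julia 1.0 rules,
# `tmp` and `i` are local to the loop body.
for i = 1:3
    tmp = i^2
end

end # module

# Neither name became a global of the module.
println(isdefined(Demo, :tmp))  # false
println(isdefined(Demo, :i))    # false
```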

With this in mind, I wonder whether the solution proposed in this github comment would be better: hard scope in modules (that are “production code”) and soft scope otherwise (scripts and REPL). The inconsistency may be annoying but by the time a user can write a module, I’m sure he/she can deal with this scope subtlety. In this scenario though what a “script” does when executed is a bit tricky to decide but one could opt in or out of this behavior by adding a line at the beginning of the file, say softscope(true) or something similar.

Concerning the wave of questions about this on discourse / slack / stack overflow etc., I wanted to point out that this could probably be reduced by a better error message. Right now one gets:

julia> x = 1
1

julia> for i = 1:10
       x = x + 1
       end
ERROR: UndefVarError: x not defined

Whereas something like:

x inside the for loop refers to a local variable, use global x to modify global variable x

would probably be better.
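
For reference, a minimal sketch of the fix that such a message would point to, using the same x and loop as above:

```julia
x = 1

for i = 1:10
    global x     # declare that x refers to the global, not a new local
    x = x + 1
end

println(x)  # 11
```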

This is what I have suggested in my post. However in the meantime I think I have changed my mind and prefer the solution proposed in this thread:

  • no global annotation needed, IJulia could drop the ‘automatically-insert-global-hack’ (hopefully I’m not wrong here, didn’t use IJulia in 1.0 yet). REPL doesn’t ‘need’ a similar ‘~hack’.
  • no (?) other language has such scoping rules in the for loop (is this true?)
  • no surprises (1. there was a mention from a Matlab user at the breakfast table: you have to fix this – I think this is strong signal and bears weight, 2. there was a ‘bug report’ exactly about this problem – another signal of unexpectedness)
  • advanced users are more fit to declare what they want, i.e. write local when needed (local for is a cool idea, btw.)
  • copy/paste from REPL to functions works right away without ‘global adjustment’
  • former hard/soft scope complication is still solved

The only price is that the for scope is now global ([Edit]: maybe this is really bad, I’m not sure). – This said, I don’t feel competent enough to really state an opinion…

Yes, I’m referring to the behaviour under the proposed solution; not Julia 1.0, which I find more consistent.

You are right.
Sorry for the misunderstanding.

As a new Julia user, I agree that the message should be more descriptive, so that the user can understand what the issue is and how to resolve it, rather than a standard error message that leaves the user perplexed about what to do.

Maybe just as a reminder (from “Why we created Julia”):

Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

This is something I can subscribe to, and for me the first “simple to learn” also includes “simple to explain/teach”.

I’m probably not adding much to the signal:noise ratio here but I think this comment is right on the money

While I like the original proposal (thanks very much to the core devs for thinking this through!), the route to get there looks quite painful, with the potential to break a lot of code through the deprecation process. While it should be an easy change to fix, I think we’ve seen that a lot of people don’t make use of the available tools to automatically apply fixes (e.g., FemtoCleaner), and so it could well cause more pain than it’s worth.

While I’d very much like to have the original proposal implemented, I suspect a more fruitful/less painful approach would be to implement the error message as mentioned by @piever and also include an @global macro that mirrors the global statement in that it simply makes all assignments in the annotated code block global (unless previously annotated as local). This would be very similar to the @softscope macro but slightly simpler in form.

In this case we might have the following workflow -

julia> x = 1
1

julia> for i = 1:10
       x = x + 1
       end
ERROR: UndefVarError: x not defined; x inside the for loop refers to a local variable, use
`global x` or `@global for` to modify global variable x.
Stacktrace:
 [1] top-level scope at .\REPL[3]:2 [inlined]
 [2] top-level scope at .\none:0

julia> @global for i = 1:10
       x = x + 1
       end

julia> x
11

I’d expect IJulia workbooks to continue using SoftGlobalScope (or the original proposal in this thread) but that’s fine since workbooks are a very different beast.

While a more informative error message would be an improvement, it doesn’t fix the problems that

  • The very first time you write an interactive loop you have to understand the distinction between local and global scopes. This will confuse and turn off a lot of potential users.
    • This is a huge problem for pedagogy in a non-CS context where you just want to use Julia interactively. Imagine trying to teach statistics and having to stop in the middle of the lecture to explain scope.
    • For a large number of new users it will make Julia seem pointlessly picky compared to any other interactive language.
  • It’s a big annoyance even for experienced users, because it makes interactive Julia code harder to write, and makes it harder to paste code in functions to/from the REPL to try things out.

The good news is that, since the problem with global scoping semantics mainly arises in interactive contexts, we can initially improve matters in a non-breaking way by implementing this only in the REPL and other opt-in contexts.

FWIW I much prefer the proposed behavior to the 1.0 behavior. For one, I think most commonly used languages have soft scope for loops, which makes Julia’s 1.0 behavior quite unintuitive. Secondly, I think the soft scope better fits the most natural, most common use case: set up a variable, iterate some operation on it, then do something with the result.
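
The pattern described here already works inside a function body in Julia 1.0, where a loop can update a variable of the enclosing local scope. A minimal sketch (sum_to is a made-up name for illustration):

```julia
# Set up a variable, iterate an operation on it, then use the result.
function sum_to(n)
    total = 0
    for i = 1:n
        total = total + i   # updates the `total` of the enclosing function scope
    end
    return total
end

println(sum_to(10))  # 55
```

The disagreement in this thread is about whether the same three steps should also work unannotated at top level.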

It looks like many people (including me) are happy with the new for scoping rule. But what about let?

Keeping the consistency between for and let means that forgetting a single comma would alter the program in a subtle way, right?:

julia> x = 1
       y = 2;

julia> let x = 10,
           y = 20
       end

julia> x, y
(1, 2)

julia> let x = 10  # no comma
           y = 20
       end
20

julia> x, y  # it is (1, 2) in v1.0
(1, 20)

I know this wouldn’t be the only place where a single comma is important. I’m just trying to understand the consequence.

This also breaks the natural expectation that let; ... end is equivalent to (() -> begin ... end)().

Just to add a small dose of (non-Matlab) prior art… Ruby had to deal with the question of shadowing scopes in lambdas a while back. Their solution was to add an optional declaration that would prevent variable assignment within the lambda scope from overwriting a variable from outside the lambda scope. For example:

a = 1 #=> 1 (must exist before the lambdas are defined so that they can capture it)

noshadow = ->(x) { puts "'a' is #{a}"; a = x }
shadow = ->(x; a) { puts "'a' is #{a}"; a = x }

noshadow.(2) #=> 'a' is 1
a #=> 2
shadow.(3) #=> 'a' is
a #=> 2

Note that in the second case, 'a' is is displayed because a is nil at the point of the print statement.

A few nice features of this:

  • if you don’t know that Ruby lambdas introduce their own scope, then everything seems to work as expected, with variables from outside the scope being captured by and modifiable from within the lambda
  • you can write “safe” lambdas where you don’t have to worry about unintentionally affecting the calling scope
  • the shadow declaration initializes the variable, so you can reference it before first assignment (without the (...; a) part of the declaration, the puts statement would error on an undefined variable)

One thing that is very different between this Ruby case and what we’re discussing with Julia, however, is that absent the existence of a in the enclosing scope, any a = x assignment within the lambda will only declare a scope-local variable. In other words:

l = -> (x) { a = x }
l.(2)
a #=> undefined local variable (i.e. `a` within the lambda body was lambda-local)
a = 1
l.(2)
a #=> `a` within the lambda body this time referred to the `a` from the enclosing scope

Apologies if this is adding to the noise, but I think Ruby is, if nothing else, a very beginner friendly language and it might be useful to learn from it. That said, it may also be that some of its “beginner friendliness” comes from its (some might call it “egregious”) use of dynamic scoping…and that may not be a bridge we’re willing to cross.

That is an orthogonal concern. It’s just a property of Julia’s let syntax: once the commas stop you are inside the block and no longer introducing new let-bound variables. In general that will mean different behavior unless we make much more radical changes.

I don’t think it’s possible to have both formal properties like this, as well as “ergonomic” syntax optimized for convenience. (As a footnote, while that equivalence holds in Scheme it does not hold in ML-family languages.) Also, the equivalence would hold once you are in local scope, because in local scope all assignments already overwrite outer variables by default.
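
A minimal sketch of that last point, showing that inside a function (local scope) a bare let block and an immediately called closure both overwrite the outer variable. The function names via_let and via_closure are made up for illustration.

```julia
# Inside a function, `let` without bindings assigns to the enclosing local.
function via_let()
    x = 1
    let
        x = 2
    end
    return x
end

# An immediately called closure also assigns to the captured outer local.
function via_closure()
    x = 1
    (() -> begin
        x = 2
    end)()
    return x
end

println(via_let())      # 2
println(via_closure())  # 2
```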

I’m curious to know why breaking consistency between let and function is preferable over breaking consistency between let and for. If you say that such consistency between let and function did not exist in the first place then I guess that’s the answer. But I also think creating “stronger” scope by let could be considered “ergonomic” as well since the reason why users would write let is to introduce a scope; so why not give them a stronger/safer one?

I think the main thing is to have fewer exceptions. Making functions the lone exception is simpler. Also, functions are special in that they indicate an intent to create a reusable piece of code that therefore needs some extra isolation.

I wouldn’t say the purpose of let is to introduce a new scope. Rather the purpose is to create specific new variable bindings. For example this pattern is very useful:

let e = 2.7
    exp(x) = e^x
end

At the top level, having that define a global function exp is what we usually want, and we don’t get it currently (you have to write global).
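
For reference, a sketch of what has to be written in 1.0 to get a global function out of that pattern. The name pow_e is used here instead of exp only to avoid clashing with Base.exp; it is not from the discussion.

```julia
# The let-bound `e` is private to the closure; `global` makes the
# function itself visible at top level under the 1.0 rules.
let e = 2.7
    global pow_e(x) = e^x
end

println(pow_e(2))  # same value as 2.7^2
```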

Another reason is debugging by copying code from functions to the REPL (I confess I do that a lot). In 0.6 it fails only for functions with inner functions. In 1.0 it fails on all scoped constructs. It would be nice to at least go back to everything except inner functions working.

To me, it’s not enough just to guess that since somebody wrote let they want more things to be local. There would need to be a specific, useful code pattern that’s more elegant under that assumption. For example the pattern that kicked off this issue is initializing a variable, and then updating it in a loop. I’m not sure there are any similar patterns that benefit from making more variables local inside let. Given that in general we default to overwriting variables in outer scopes, it can’t be all that important for let to be special here.

Thanks, the exp example is actually compelling. The use case I had in mind was something like this at module level:

let # works in v1.0
    for b in 0:10
        exp = Symbol("exp", b)
        @eval $exp(x) = $b^x
        if b > 0
            invexp = Symbol("invexp", b)
            @eval $invexp(x) = $(inv(b))^x
        end
    end
end

It would be bad to have the temporary variables exp and invexp leak out to the module’s top-level scope. Of course, one can use local or let in front of each temporary variable, but I’d say that’s uglier.
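
A sketch of the explicit local variant mentioned above, with a shorter range and made-up names (tmpname, pow0 to pow3). In 1.0 the local is redundant; under the proposed rules it would presumably be what keeps the temporary out of the module.

```julia
# Generate pow0, pow1, pow2, pow3 without leaving the temporary behind.
for b in 0:3
    local tmpname = Symbol("pow", b)   # explicitly loop-local
    @eval $tmpname(x) = $b^x
end

println(pow2(3))                            # 8
println(isdefined(@__MODULE__, :tmpname))   # false
```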

Wouldn’t it actually be better for this if let; ... end were equivalent to (() -> begin ... end)()? You could just wrap the code in a let block and then it would work even for edge cases like code with inner functions, right? (We could even have a keyboard shortcut for this.)

I’m not sure there are any similar patterns that benefit from making more variables local inside let. Given that in general we default to overwriting variables in outer scopes, it can’t be all that important for let to be special here.

Well, doesn’t this count as an example? In particular, Rebugger cannot work in 0.6 or 0.7 (because of the scope deprecation) but it does in 1.0.
