New scope solution

If this soft global scope hack means that I won’t be able to copy some code from a test file or an examples file and paste it into the REPL or IJulia to reproduce the behavior, and vice versa, then I find this a very weird divergence of language semantics. I actually prefer the current behavior over such divergence, if votes count, and I prefer the 0.6 behavior over both.

I am not sure whether you know this, but currently you cannot copy code from a function body to global scope in the REPL and expect the same behavior.

I did not mean to copy from a function body, but to/from global scope in a test file.

It might avoid some confusion if the compiler emitted a warning when the following two conditions are both met:

  1. A local variable is assigned, but never read before it falls out of scope.
  2. A global variable with the same name exists.

Condition 1 would be checked statically when the block is compiled, and condition 2 would be checked at run-time.

Consider the example:

for x in 1:5
    if x == 5
        found = true
    end
end

Here, the compiler would see found as a local variable that is written but never read, and would emit code that checks at run-time whether a global variable of the same name exists and, if so, gives a warning: “Did you mean global found = true?”

(If the global found is a Bool then the warning could suggest found |= true as an alternative.)

There’s still the issue that a @show found that the user puts in for debugging purposes would make the warning go away. I don’t think there’s a way to avoid this.

On a related note: How about allowing postfix if following an assignment, and making

found = true if x == 5

equivalent to

found = if x == 5
   true
else
   found
end

That would create a code-path that reads before writing, which is what we want in these cases.
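For comparison, here is a minimal sketch of what can already be written today. The postfix form itself is only a proposal; the function names `contains_five` and `contains_five_short` are made up for illustration, and the loops are placed inside functions so that the behavior does not depend on the top-level scoping rules under discussion.

```julia
# Long form of the proposed `found = true if x == 5`, written out by hand.
function contains_five(xs)
    found = false
    for x in xs
        found = if x == 5
            true
        else
            found   # read before write on this path
        end
    end
    return found
end

# Shorter spelling using the `|=` update operator suggested earlier for Bool flags.
function contains_five_short(xs)
    found = false
    for x in xs
        found |= (x == 5)
    end
    return found
end

println(contains_five(1:5))
println(contains_five_short(1:4))
```

Both versions read `found` before writing it, which is the property the proposal relies on.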

I appreciate the effort to find a solution to a difficult problem — thanks for thinking about this!

However, I am not sure I like the fact that to reason about the global/local status of x, I have to look at surrounding code, and all the branches. IMO doing this kind of reasoning is not something that humans are particularly good at (as opposed to compilers).

I find it problematic that if I comment out some lines (which I do sometimes for debugging or WIP code), x could flip back and forth between local and global. While I recognize that it can be a pedagogical challenge in some contexts, I find the status quo of v1.0 easier to reason about.

It seems the options are:

  • local scope + error messages so beginners at least know what's going on
  • local scope + SoftGlobalScope (or some other tooling) by default in the REPL
  • global scope
  • global scope + local let (or equivalent) to easily create locally scoped blocks
  • DWIM scope
  • abandon Julia and go back to Python

A subset of users/devs will be unhappy with the final decision :man_shrugging:

I’d also want better error messages in this case (in files).

This would also be my personal preference. It’s solid, non-breaking, and we don’t have to change (again) all stack overflow and discourse post answers related to this scoping issue. It would just subtly improve the situation.

Sometimes you want to break out of nested for loops. What about this? This is the current behavior:

julia> broken = false
       for i in 1:2
            # broken = false
            for j in 1:2
                if true broken = true;break;end
            end
            println("inner $broken")
            if broken break; end
        end
        println(broken)
inner false
inner false
false

What would we expect here?
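As a point of reference, here is a sketch of the same nested loop wrapped in a function (the name `nested_break` is made up for illustration). Inside a function, `broken` is an ordinary local of the function body, so both loops assign to and read the same variable:

```julia
function nested_break()
    broken = false
    for i in 1:2
        for j in 1:2
            broken = true   # assigns the function-level local, not a new loop-local
            break
        end
        println("inner $broken")
        broken && break     # leave the outer loop as well
    end
    return broken
end

println(nested_break())
```

Here the flag set in the inner loop is visible in the outer loop, so the outer loop stops after its first iteration.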

We could also look at the “write only” / “no use” case:

found = false
for i in 1:2 
    found = true
    @debug_test found == true   # it is used (in read mode) here if macro is expanded!
end

The debug version would behave differently from the non-debug version!

I think rules should be simple and easy, and this one is not! If you really feel that you need this rule, remove it again in Julia 2.0, or earlier if possible.

The current behavior is quite a mess too. Look at Schrödinger’s cat:

julia> dead = false
       for j in 1:1
           if rand()>0.5 dead = true;end  # cat is unlucky :(
           print("is shroedinger's cat dead? $dead")
       end
is shroedinger's cat dead? true

If we remove the assignment, the cat is happy:

julia> dead = false
       for j in 1:1
           # if rand()>0.5 dead = true;end
           print("is shroedinger's cat dead? $dead")  # cat is lucky! :) 
       end
is shroedinger's cat dead? false

But when the cat gets lucky, the experiment breaks:

julia> dead = false
       for j in 1:1
           if rand()>0.5 dead = true;end   
           print("is shroedinger's cat dead? $dead")  # coder is not lucky :( 
       end
ERROR: UndefVarError: dead not defined

It seems that the assignment (which never happened!) made the variable dead local and left it undefined.
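A deterministic variant of the experiment, as a sketch. It assumes the code runs at the top level of a script or the REPL, and it uses a `let` block instead of a `for` loop because, as far as I know, `let` behaves the same way in files and in the REPL. The name `result` is only there to capture the outcome.

```julia
dead = false

result = try
    let
        if false
            dead = true   # never executed, but still makes `dead` local to the block
        end
        dead              # reads the local, which was never assigned
    end
catch err
    err
end

println(typeof(result))
println(dead)   # the global is untouched
```

The assignment never runs, yet its mere presence in the block decides that `dead` is local, so the read fails with an `UndefVarError`.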

EDIT:
Could this be optimized out in the future?

dead = false
for j in 1:1
  if VERSION<v"1.0" dead = true;end  # I want to check conditional programming here
  print("is shroedinger's cat dead? $dead")
end
ERROR: UndefVarError: dead not defined

I don’t think many people want to leave things as they are.

Both in SoftGlobalScope and in a function (any local scope) your example works as expected. It would also work if everything defaulted to global. So this is most likely going to be fixed, (almost) independently of which change is made.

How could that be, if one proposal wants to check the context, and the context is quite questionable, as my tests show too?

What does it really mean that a variable is not used? Or that it is used “write only”?

Maybe I am wrong. Could you explain it in more detail, please?

Error messages: https://discourse.julialang.org/t/improving-error-messages-for-the-scoping-problem/16209.

Remember, the local/global decision for variables does not happen at runtime, it happens at compile-time (I think early in lowering?). I think Stefan’s solution is to follow unconditional @goto and always follow both branches (even for literal if false). So it would fix the following example:

julia> dead=true;
julia> let 
           @show dead
           @goto skip
           dead = false
           @label skip
       end
ERROR: UndefVarError: dead not defined

but not

julia> dead=true;
julia> let 
           @show dead
           if true @goto skip end
           dead = false
           @label skip
       end
ERROR: UndefVarError: dead not defined

The rule would be: Follow all paths (without evaluating known conditionals). If there exists a write-before-read path, then the variable defaults to local. Otherwise, it defaults to global.

As a side note: The while condition gets evaluated in the outer scope, not the inner scope. That is probably confusing for some people as well:

julia> m=4; n=2; i=1; while i>0
       i = n
       @show i, n
       global m -= 1
       global n -= 1
       @show m,n
       m>0 || break
       end; @show m, n, i
(i, n) = (2, 2)
(m, n) = (3, 1)
(i, n) = (1, 1)
(m, n) = (2, 0)
(i, n) = (0, 0)
(m, n) = (1, -1)
(i, n) = (-1, -1)
(m, n) = (0, -2)
(m, n, i) = (0, -2, 1)

So, regardless of this scoping question, a minimally invasive (very non-optimizing) @code_semilowered that produces valid Julia source code with only let blocks and @goto would be nice for that. It would also teach people about the iterator interface.
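To illustrate the iterator-interface point, here is a hand-written sketch of roughly what a `for` loop expands to. This is not the output of any existing tool (`@code_semilowered` is only a wish here), and the function names `sum_for` and `sum_lowered` are made up for the example.

```julia
# Ordinary for loop.
function sum_for(itr)
    total = 0
    for x in itr
        total += x
    end
    return total
end

# Roughly equivalent loop written against the iteration protocol:
# `iterate` returns `nothing` when done, otherwise an (item, state) tuple.
function sum_lowered(itr)
    total = 0
    next = iterate(itr)
    while next !== nothing
        (x, state) = next
        total += x
        next = iterate(itr, state)
    end
    return total
end

println(sum_for(1:4))
println(sum_lowered(1:4))
```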

This proposed solution bears a striking resemblance to escape analysis, which is a tricky beast but also key to some seriously powerful compiler optimizations. It is also, notoriously, the one optimization that Java can still not perform (well). I bring this up because I think it is worth considering the fix to this “bug” in the larger context of escape analysis.

Java has problems with escape analysis not only because it is a difficult optimization to perform, but also because the language was not designed with it in mind. With Julia, we have the opportunity to evolve the language in a way that would facilitate escape analysis.

I think the crux of the scope “bug” is the desire to create strongly bounded scopes. We want this because it simplifies escape analysis. If we state that any variable created within a for loop, or within a function, falls out of scope at the conclusion of the loop or function body, unless returned, then we only need to follow the path of explicit returns to perform escape analysis. However, if some value within one of these scopes is assigned to a global variable, then we must consider multiple escape routes. Consider, for example:

b = []
function foo()
  global b
  for i = 1:10  
    append!(b, i)
  end
end
foo()

There are, in this function, 10 values that have escaped the function scope. Still, because we must specify global b, analysis is relatively straightforward. The more complicated the rules become for determining when a variable might escape a scope, the more difficult it becomes to perform escape analysis.
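One detail that may be worth separating out, shown as a sketch with made-up names (`items`, `counter`, `fill_items!`, `bump_counter!`): in current Julia, a `global` declaration inside a function is needed to rebind a global, whereas mutating the object a global refers to works without it. Values can therefore reach a global container either way.

```julia
items = Int[]
counter = 0

function fill_items!()
    for i in 1:10
        push!(items, i)   # mutation of the global's contents: no `global` needed
    end
end

function bump_counter!()
    global counter        # rebinding the global name requires the declaration
    counter += 1
end

fill_items!()
bump_counter!()
println(length(items))
println(counter)
```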

The REPL throws a monkey-wrench into all of this, as it is essentially a never-ending function call. Nothing can escape the REPL, so we would like to relax some of the constraints around escape analysis in the name of “user experience”. The problem, of course, is that the REPL is not a function call.


In short, I am not in favor of this proposed solution because of how it potentially complicates escape analysis. I do think, however, that it highlights one potential path toward a more general solution. What if, instead of tweaking the rules for how variables might, or might not, escape from an inner scope, we allowed for outer scopes to explicitly opt out of variable escaping? In other words, what if you could do the following:

module Foo
  locally_scoped() # => this call alters the scoping rules of the module
  b = []

  function bar()
    for i = 1:10 
      append!(b, i)
    end
  end

  function baz()
    @show b
  end
end

Foo.bar();
Foo.baz() # => 10-element Array{Any,1}: 1, 2, ...

This way, the REPL could evaluate in a module context wherein every variable is considered locally scoped, but we can still preserve the ability to perform escape analysis (in every other module).

Thanks for the reaction! :slight_smile:

You are more experienced. Could you tell me whether there is a way to do conditional compilation similar to C++'s #ifdef?

Could @assert be optimized out in a future version, given that it has such subtle implications for variable scope?
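One existing tool that comes close is the `@static` macro from Base, which evaluates the condition when the macro is expanded and keeps only the selected branch. The sketch below assumes it runs at the top level of a script or the REPL on Julia 1.0 or later; the name `seen` is only there to capture the result.

```julia
dead = false

seen = let
    @static if VERSION < v"1.0"
        dead = true   # dropped at macro-expansion time on Julia >= 1.0
    end
    dead              # no assignment is left in the block, so this reads the global
end

println(seen)
```

Because the assignment is removed before the scoping decision is made, it no longer turns `dead` into a local, unlike a plain `if VERSION < v"1.0"`.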

Is it true? I am really confused as well! :stuck_out_tongue:

Maybe there can be a balance. The “if read before write then the user refers to the global variable” is probably safe enough (if you were writing that, maybe while debugging code, you’d be getting an error so you are really not losing much). In case this could still cause confusion, I imagine there is always the option to allow this but throw a warning (Read before write variable in a scoped block defaults to global: to avoid this warning add the keyword global). The new user can decide to ignore the warning (or learn from it) and the advanced user can copy paste the for loop from function body to REPL anyway as in this scenario the warning doesn’t matter so much (and add global in production code). The warning also has the advantage that the user will suspect that fancier tricks, like:

myvar = 0
for i = 1:10 
  myvar = i
  i == 5 && break
end

may require the keyword global to work as intended.
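For reference, a sketch of the same loop with the keyword added, which leaves no room for interpretation at top level:

```julia
myvar = 0
for i = 1:10
    global myvar = i   # explicitly assign the global
    i == 5 && break
end

println(myvar)
```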

OTOH, the “if we write to the variable but never read it, then it is local” rule is IMO a bit extreme, and here I completely agree that it risks getting too confusing (some @show statements during debugging could cause things to flip). I’m also afraid that this change is technically breaking. That is to say, if some user wrote:

myvar = 0
for i = 1:10 
    myvar = i
end
@assert myvar == 0

their code would break. I imagine nobody would write something like this on purpose, but I wonder whether semver allows this kind of change in a minor release.

We showed above that the scope of a variable is decided at compile time, before the call (it could be different in the REPL, though). Doesn’t that apply here?

Yeah, I just sketched this up quickly, and you’re right that it would likely have to be some sort of new keyword or compiler directive. Maybe:

locally_scoped module Foo
# ...
end

But the idea is that, semantically, this would be the same as a magical macro that prepended global to every variable definition.

This is safe in static code. But under this “solution” it would be unsafe to add a simple “read” line to the code.

But I suppose you know: