If this soft global scope hack means that I won’t be able to copy some code from a test file or an examples file and paste it into the REPL or IJulia to reproduce the behavior, and vice versa, then I find this a very weird divergence of language semantics. I actually prefer the current behavior over such divergence, if votes count, and I prefer the 0.6 behavior over both.
I am not sure whether you know this, but currently you cannot copy code from a function body to global scope in the REPL and expect the same behavior.
I did not mean to copy from a function body, but to/from global scope in a test file.
It may avoid some confusion if the compiler emits a warning if the following two conditions are met:
- A local variable is assigned, but never read before it falls out of scope.
- A global variable with the same name exists.
Condition 1 would be checked statically when the block is compiled, and condition 2 would be checked at run-time.
Consider the example:
for x in 1:5
    if x == 5
        found = true
    end
end
Here, the compiler would see found as a local variable that is written but not read, and emit code that checks for the existence of a global variable and, if one exists, gives a warning: “Did you mean global found = true?”
(If the global found is a Bool then the warning could suggest found |= true as an alternative.)
There’s still the issue that a @show found that the user puts in for debugging purposes would make the warning go away. I don’t think there’s a way to avoid this.
On a related note: How about allowing postfix if following an assignment, and making
found = true if x == 5
equivalent to
found = if x == 5
    true
else
    found
end
That would create a code-path that reads before writing, which is what we want in these cases.
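For comparison, a minimal sketch of how the same read-before-write effect can already be written in current Julia with the ternary operator. The function name contains_five is made up for illustration, and the loop is wrapped in a function so that found is an ordinary local regardless of the global-scope rules under discussion:

```julia
# Hypothetical helper: the ternary form reads `found` on the fall-through
# path, which is what the proposed postfix `if` would expand to.
function contains_five(xs)
    found = false
    for x in xs
        found = x == 5 ? true : found
    end
    return found
end

contains_five(1:5)   # true
contains_five(1:4)   # false
```

For a Bool flag, found |= (x == 5) would be a shorter spelling of the same thing.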
I appreciate the effort to find a solution to a difficult problem — thanks for thinking about this!
However, I am not sure I like the fact that to reason about the global/local status of x, I have to look at surrounding code, and all the branches. IMO doing this kind of reasoning is not something that humans are particularly good at (as opposed to compilers).
I find it problematic that if I comment out some lines (which I do sometimes for debugging or WIP code), x could flip back and forth between local and global. While I recognize that it can be a pedagogical challenge in some contexts, I find the status quo of v1.0 easier to reason about.
It seems the options are:
- local scope + error messages so beginners at least know what's going on
- local scope + SoftGlobalScope (or some other tooling) by default in the REPL
- global scope
- global scope + local let (or equivalent) to easily create locally scoped blocks
- DWIM scope
- abandon Julia and go back to Python
A subset of users/devs will be unhappy with the final decision
I’d also want better error messages in this case (in files).
This would also be my personal preference. It’s solid, non-breaking, and we don’t have to change (again) all stack overflow and discourse post answers related to this scoping issue. It would just subtly improve the situation.
Sometimes you want to break out of nested for loops. What about this? This is the current behavior:
julia> broken = false
       for i in 1:2
           # broken = false
           for j in 1:2
               if true broken = true;break;end
           end
           println("inner $broken")
           if broken break; end
       end
       println(broken)
inner false
inner false
false
What would we expect here?
We could look at “write only”/“no use” case:
found = false
for i in 1:2
    found = true
    @debug_test found == true # it is used (in read mode) here if macro is expanded!
end
The debug version will behave differently from the non-debug version!
I think rules should be simple and easy, and this one is not! If you really feel that you need this rule, remove it in Julia 2.0, or earlier if possible.
The current behavior is quite a mess too. Look at Schroedinger’s cat:
julia> dead = false
       for j in 1:1
           if rand()>0.5 dead = true;end # cat is unlucky :(
           print("is shroedinger's cat dead? $dead")
       end
is shroedinger's cat dead? true
If we avoid the assignment, the cat is happy:
julia> dead = false
       for j in 1:1
           # if rand()>0.5 dead = true;end
           print("is shroedinger's cat dead? $dead") # cat is lucky! :)
       end
is shroedinger's cat dead? false
But when the cat is lucky, the experiment is broken:
julia> dead = false
       for j in 1:1
           if rand()>0.5 dead = true;end
           print("is shroedinger's cat dead? $dead") # coder is not lucky :(
       end
ERROR: UndefVarError: dead not defined
It seems that the assignment (which never happened!) made the dead variable local and undefined.
EDIT:
Could this be optimized out in future?
dead = false
for j in 1:1
    if VERSION<v"1.0" dead = true;end # I want to check conditional programming here
    print("is shroedinger's cat dead? $dead")
end
ERROR: UndefVarError: dead not defined
I don’t think many people want to leave things as they are.
Both in SoftGlobalScope and in a function (any local scope) your example works as expected. It would also work if everything would default to global. So this is most likely going to be fixed, (almost) independent of what change will be made.
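A minimal sketch of the "in a function" case, using the nested-break example from earlier in the thread. The function name nested_break and the outer_iterations counter are added here only to make the result observable:

```julia
# Inside a function, `broken` is a single local variable shared by both loops,
# so the inner assignment is visible to the outer loop.
function nested_break()
    broken = false
    outer_iterations = 0
    for i in 1:2
        outer_iterations += 1
        for j in 1:2
            broken = true
            break
        end
        broken && break
    end
    return broken, outer_iterations
end

nested_break()   # (true, 1)
```

The outer loop stops after its first iteration, which is the behavior the original poster expected at global scope.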
How can that be, if one proposal wants to check context, and context is quite questionable, as can be seen from my tests too?
What does it really mean that a variable is not used? Or that it is used “write only”?
Maybe I am wrong; could you explain it more, please?
Error messages: https://discourse.julialang.org/t/improving-error-messages-for-the-scoping-problem/16209.
Remember, the local/global decision for variables does not happen at runtime, it happens at compile-time (I think early in lowering?). I think Stefan’s solution is to follow unconditional @goto and always follow both branches (even for literal if false). So it would fix the following example:
julia> dead=true;
julia> let
           @show dead
           @goto skip
           dead = false
           @label skip
       end
ERROR: UndefVarError: dead not defined
but not
julia> dead=true;
julia> let
           @show dead
           if true @goto skip end
           dead = false
           @label skip
       end
ERROR: UndefVarError: dead not defined
The rule would be: Follow all paths (without evaluating known conditionals). If there exists a write-before-read path, then the variable defaults to local. Otherwise, it defaults to global.
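A toy sketch of that rule as stated, not of anything the compiler actually does. It assumes each control-flow path has already been reduced by hand to the ordered list of accesses (:read or :write) to one variable; the names first_access, defaults_to_local, cat_paths and read_only_paths are hypothetical:

```julia
# Hypothetical model: each path is the ordered list of accesses to one
# variable name inside the scoped block.
first_access(path) = isempty(path) ? nothing : first(path)

# The stated rule: local if some path writes before it reads, otherwise global.
defaults_to_local(paths) = any(p -> first_access(p) === :write, paths)

# Cat example: the `if` branch writes then reads; the fall-through path only reads.
cat_paths = [[:write, :read], [:read]]
# Same loop with the assignment commented out: every path only reads.
read_only_paths = [[:read]]

defaults_to_local(cat_paths)        # true
defaults_to_local(read_only_paths)  # false
```

The hard part in practice is enumerating the paths, which this sketch leaves out entirely.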
As a side note: The while condition gets evaluated in the outer scope, not the inner scope. That is probably confusing for some people as well:
julia> m=4; n=2; i=1; while i>0
           i = n
           @show i, n
           global m -= 1
           global n -= 1
           @show m,n
           m>0 || break
       end; @show m, n, i
(i, n) = (2, 2)
(m, n) = (3, 1)
(i, n) = (1, 1)
(m, n) = (2, 0)
(i, n) = (0, 0)
(m, n) = (1, -1)
(i, n) = (-1, -1)
(m, n) = (0, -2)
(m, n, i) = (0, -2, 1)
So, regardless of this scoping, a minimally invasive (very non-optimizing) @code_semilowered that produces valid julia source code with only let blocks and @goto would be nice for that. It would also teach people about the iterator interface.
This proposed solution bears a striking resemblance to escape analysis, which is a tricky beast but also key to some seriously powerful compiler optimizations. It is also, notoriously, the one optimization that Java can still not perform (well). I bring this up because I think it is worth considering the fix to this “bug” in the larger context of escape analysis.
Java has problems with escape analysis not only because it is a difficult optimization to perform, but also because the language was not designed with it in mind. With Julia, we have the opportunity to evolve the language in a way that would facilitate escape analysis.
I think the crux of the scope “bug” is the desire to create strongly bounded scopes. We want this because it simplifies escape analysis. If we state that any variable created within a for loop, or within a function, falls out of scope at the conclusion of the loop or function body, unless returned, then we only need follow the path of explicit returns to perform escape analysis. However, if some value within one of these scopes is assigned to a global variable then we must consider multiple escape routes. Consider, for example:
b = []
function foo()
    global b
    for i = 1:10
        append!(b, i)
    end
end
foo()
There are, in this function, 10 values that have escaped the function scope. Still, because we must specify global b, analysis is relatively straightforward. The more complicated the rules become for determining when a variable might escape a scope, the more difficult it becomes to perform escape analysis.
The REPL throws a monkey-wrench into all of this, as it is essentially a never-ending function call. Nothing can escape the REPL, so we would like to relax some of the constraints around escape analysis in the name of “user experience”. The problem, of course, is that the REPL is not a function call.
In short, I am not in favor of this proposed solution because of how it potentially complicates escape analysis. I do think, however, that it highlights one potential path toward a more general solution. What if, instead of tweaking the rules for how variables might, or might not, escape from an inner scope, we allowed for outer scopes to explicitly opt out of variable escaping? In other words, what if you could do the following:
module Foo
locally_scoped() # => this call alters the scoping rules of the module
b = []
function bar()
    for i = 1:10
        append!(b, i)
    end
end
function baz()
    @show b
end
end
Foo.bar();
Foo.baz() # => 10-element Array{Any,1}: 1, 2, ...
This way, the REPL could evaluate in a module context wherein every variable is considered locally scoped, but we can still preserve the ability to perform escape analysis (in every other module).
Thanks for the reaction!
You are more experienced; could you tell me whether there is a way to do conditional compilation similar to C++'s #ifdef?
Could @assert be optimized out in a future version, if there are such subtle implications for variable scope?
Is it true? I am really confused as well!
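One possible sketch for the #ifdef question, assuming Base's @static macro is an acceptable stand-in: it evaluates the condition at macro-expansion time and keeps only the branch that is taken, so on Julia 1.0 or later the assignment below should never reach the scoping analysis. The seen vector is added here only to record what the loop observes:

```julia
dead = false
seen = Bool[]   # illustration only: records the value the loop body reads

for j in 1:1
    # On Julia >= 1.0 this whole `if` expands to `nothing`, so no assignment
    # to `dead` appears in the loop body and `dead` refers to the global.
    @static if VERSION < v"1.0"
        dead = true
    end
    push!(seen, dead)
end

seen   # Bool[0] on Julia >= 1.0, i.e. the loop read the global `dead`
```

This only helps when the condition is known at expansion time, as with VERSION; it does nothing for a runtime condition such as rand() > 0.5.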
Maybe there can be a balance. The “if read before write then the user refers to the global variable” is probably safe enough (if you were writing that, maybe while debugging code, you’d be getting an error so you are really not losing much). In case this could still cause confusion, I imagine there is always the option to allow this but throw a warning (Read before write variable in a scoped block defaults to global: to avoid this warning add the keyword global). The new user can decide to ignore the warning (or learn from it) and the advanced user can copy paste the for loop from function body to REPL anyway as in this scenario the warning doesn’t matter so much (and add global in production code). The warning also has the advantage that the user will suspect that fancier tricks, like:
myvar = 0
for i = 1:10
    myvar = i
    i == 5 && break
end
may require the keyword global to work as intended.
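For reference, a minimal sketch of the same loop with the explicit keyword, which should behave the same way at top level in a file and in the REPL:

```julia
myvar = 0
for i = 1:10
    global myvar = i   # explicitly assign to the global, not a new loop-local
    i == 5 && break
end

myvar   # 5
```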
OTOH the “if we write on the variable but never read, then it is local” is IMO a bit extreme and here I completely agree that it risks getting too confusing (some @show statements during debugging could cause things to flip). I’m also afraid that this change is technically breaking. That is to say, if some users wrote:
myvar = 0
for i = 1:10
    myvar = i
end
@assert myvar == 0
their code would break. I imagine nobody would write something like this on purpose, but I wonder whether semver allows this kind of change in a minor release.
We showed above that the scope of variables is decided at compile time, before the call (it could be different in the REPL, though). Doesn’t that apply here?
Yeah, I just sketched this up quickly, and you’re right that it would likely have to be some sort of new keyword or compiler directive. Maybe:
locally_scoped module Foo
    # ...
end
But the idea is that, semantically, this would be the same as a magical macro that prepended global to every variable definition.
This is safe in static code. But under this “solution” it would be unsafe to add even a simple “read” line to the code.
But I suppose you know: