Personally, I appreciate @sadish-d for pointing out their opinions on Julia’s variable scope and declaration rules, as I think it’s nice to have people new to Julia provide fresh perspectives on certain “quirky” features of the language that frequent users might have taken for granted for so long.
Regarding this issue, I think the cause has more to do with variable declarations being resolved at compile time rather than at runtime, that is, before the if statement or return actually affects the evaluation of the program.
One thing that I found weird (and if someone can explain a good reason for that, I would be very thankful) is that to introduce a local variable, the declaration doesn’t have to appear textually before the places where the variable is used (e.g., assignment, access):
julia> function foo1(x)
           let
               x=2
               @show x
               local x
           end
           x
       end
foo1 (generic function with 1 method)
julia> let y=3
           foo1(y)
       end
x = 2
3
Even though the declaration local x does not appear before x=2, it still shadows the outside x (the input argument of foo1) and makes x=2 take effect only inside the let block. For comparison, this is the code and result without local x:
julia> function foo1_2(x)
           let
               x=2
               @show x
           end
           x
       end
foo1_2 (generic function with 1 method)
julia> let y=3
           foo1_2(y)
       end
x = 2
2
It seems that the declaration of x happens “before” the runtime evaluation (when the compiler does not know the value of the variables). In other words, the declaration of local variables is part of the static structure of their parent expression, regardless of all the evaluations (and even their orders) within it during the runtime. Another example more similar to OP’s example is:
julia> function foo2(x)
           let
               if false
                   local x
               end
               x=2
               @show x
           end
           x
       end
foo2 (generic function with 1 method)
julia> let y=3
           foo2(y)
       end
x = 2
3
Even though
if false
    local x
end
wasn’t executed at runtime, local x still shadowed the input argument x.
@frankwswang, when confused with scope, I always ask myself this: Where was the variable declared?
In your first example, there are two x’s. One is implicitly declared in the function argument, another is explicitly declared with local x inside the let block.
In your second example, there’s only one x: the one implicitly declared in the function argument.
And you’re right, it doesn’t matter if the declaration comes after the assignment, as in your first example.
@giordano, I think you’re right that this is not unique to Julia. I just tried this in Python. Defining f1() fails with the SyntaxError “no binding for nonlocal 'x' found”, meaning we’re telling it inside g() to assign to some outer x, but x is never declared outside of g. Makes sense.
def f1():
    def g():
        nonlocal x
        x = 1
    g()
    assert x == 1
f1()
f2() implicitly declares x before g() tries to assign to it and it works fine.
def f2():
    x = 0
    def g():
        nonlocal x
        x = 1
    g()
    assert x == 1
f2()
f3() puts the same implicit declaration of x behind a condition that never holds, and it still works the same as f2().
def f3():
    if False:
        x = 0
    def g():
        nonlocal x
        x = 1
    g()
    assert x == 1
f3()
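One way to see why f3() works is to inspect the compiled function object instead of running the branch. The following is only a small sketch: the function name f3_inspect is made up for illustration, and it relies on the standard `__code__.co_cellvars` attribute, which lists the locals of a function that nested functions capture.

```python
def f3_inspect():
    if False:
        x = 0  # never executed, but still makes x a local of f3_inspect

    def g():
        nonlocal x  # binds to the x that belongs to f3_inspect
        x = 1

    g()
    return x


# x is listed as a captured local even though `x = 0` can never run.
print(f3_inspect.__code__.co_cellvars)
print(f3_inspect())
```

The dead assignment is enough for the compiler to give f3_inspect its own x, so the nonlocal statement inside g() has something to bind to.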
I think what I was trying to indicate is that because Julia allows mixing variable declaration and variable assignment/evaluation, which are “evaluated” during the different stages of program execution (compile time vs. runtime), it can have seemingly paradoxical behaviors in the edge cases you and others have shown.
These consequences are essentially the price to pay for favoring a more convenient and tolerant syntax, which is often the case in modern programming language design. This is in contrast to the old-fashioned way of explicitly separating variable declarations from the rest of the program body (like FORTRAN).
Changing which scope a variable belongs to depending on whether a branch gets executed would actually be unusual for a language. That’s not lexical scoping, and it’s not how dynamic scoping works either.
Languages would do something between these extremes:
the x = 1 is considered a declaration for the outer let scope
it detects that x belongs to the outer let scope because of x = 1, but it complains that the earliest line with an assignment x = 0 does not exist in its home scope, so it errors and forces you to put an earlier local x declaration in the outer let scope to keep up the appearance of sequential execution.
As you’ve noticed, the appearance of sequential execution can be flubbed in languages that don’t demand explicit variable declarations and strictly introduce new scope in { }. Python does make us write declarations before assignments in the same scope (like def g():), but turns out it’s not strictly sequential-looking either. I offer this variant:
>>> def f4():
...     def g():
...         nonlocal x
...         x = 1
...     if False: x = 0 # comment out for f1's SyntaxError
...     g()
...     print(x)
...
>>> f4()
1
It is at least a bit more explicit than Julia because Python doesn’t let the inner function’s scope automatically reassign outer local variables and thus requires a nonlocal statement. However, it too looks into subsequent lines during parsing to figure out x’s scope and accepts even dead code without throwing an error.
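The same look-ahead can be seen without nested functions at all. In this sketch (the names reads_global and assigns_later are hypothetical, chosen just for the illustration), an assignment in dead code further down the body changes what an earlier read of x means:

```python
x = "global"


def reads_global():
    # No assignment to x anywhere in this body, so x refers to the global.
    return x


def assigns_later():
    try:
        value = x  # x is local here because of the assignment further down
    except UnboundLocalError:
        value = "unbound local"
    if False:
        x = "local"  # dead code, yet it makes x local for the whole body
    return value


print(reads_global())
print(assigns_later())
```

reads_global() returns the module-level value, while assigns_later() hits UnboundLocalError on the read, because the assignment under `if False` already decided that x is a local of that function.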
That’s definitely one reason; part of the higher “productivity” is that we don’t have to write extra lines for “obvious” information. The other part is that our machines got powerful enough for better parsers. It’s easier to run into an int var; line and know immediately that’s the variable than to see var = 2 and look at the whole scope to infer which var it is.
But evidently parsers today still don’t remove “dead code” like if false ... end, and why should they do a compiler optimization? Dead code doesn’t mean unwritten code, and nobody has ever promised that any unexecuted code would have zero effect.
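For the Python side of the comparison, the standard-library symtable module shows this scope analysis happening on source text that is never executed. This is just a sketch; the SOURCE string and the function f inside it are made up for the example.

```python
import symtable

# Source that is only analysed here, never run.
SOURCE = '''
def f():
    if False:
        x = 0
    return x
'''

module_table = symtable.symtable(SOURCE, "<example>", "exec")
f_table = module_table.get_children()[0]  # symbol table for the body of f
x_symbol = f_table.lookup("x")

# The dead assignment alone is enough to classify x as a local of f.
print(f_table.get_name(), x_symbol.is_local(), x_symbol.is_global())
```

The classification of x comes from the assignment merely being present in the function body, whether or not the branch around it can ever be taken.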
I believe the syntax is under consideration for “strict” julia as well, though as Stefan Karpinski also said in a comment in the issue you linked to, I would prefer := for initial declaration and = for reassignment. Reassignment happens more frequently and := is too much to type every time you reassign. Declaration also needs to stand out more and the := would help with that.
I think the syntax would help resolve a lot of confusion with scope, but just to be clear, I don’t think it means we would be able to conditionally declare variables with it. If I understand it correctly, it’s just syntax sugar for what would currently look like local x; x = 1.
I suspect scoping changes are the only effects of if false ... end blocks. For example, I don’t get complaints from @code_warntype if I write type-unstable code inside the dead block, and the LLVM IR printed by @code_llvm skips the dead code completely.