I want to underline this. A lot of people here are talking as if the ≤ 0.6 behavior were perfect and everyone wishes we could go back to it. That's certainly a valid point of view (after all, there's a reason we chose that behavior originally), but it is far from universally shared. The old behavior had many issues. There are several desirable criteria for scoping rules, and they do not all seem to be satisfiable at the same time:
- simple and consistent rules that are easy to explain
- code meaning does not depend on mutable global state
- top-level behavior is similar to behavior in functions
- for loops have their own scope
The ≤ 0.6 behavior satisfied #3 and #4 but failed at #1 and #2. This wasn't just a theoretical problem: there were lots of issues and complaints about the old behavior. Here are just a few:
- scoping issues, part 1 · Issue #423 · JuliaLang/julia · GitHub
- Scope and assignment order · Issue #4645 · JuliaLang/julia · GitHub
- Scope query in assignment / misunderstanding ? · Issue #6522 · JuliaLang/julia · GitHub
- improve documentation of soft vs. hard scope · Issue #9955 · JuliaLang/julia · GitHub
- A related Google Groups mailing-list thread
- Function defined in global scope has "hard scope", function in any block has "semi-soft scope" · Issue #10559 · JuliaLang/julia · GitHub
- Problems with local scope · Issue #11696 · JuliaLang/julia · GitHub
Every time we explained why it worked the way it did, the explanation was met with skepticism and people telling us that “Julia’s scoping rules are far too complicated and very hard to teach.” (And often implicitly or explicitly everyone’s favorite existential threat: “This language will fail unless you change the scope rules.”) You could very easily run some code in the REPL, have it work the first time and then run it again and have it fail every subsequent time with the only recourse being to restart your REPL session. So although many people are now talking about the ≤ 0.6 behavior as if everything was perfect, it was not.
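To make that state-dependence concrete, here is an illustrative sketch of the ≤ 0.6 soft-scope rule in global scope (this reflects the old rule that assignment in a loop updates an existing global but otherwise creates a loop-local variable; it will not run this way on modern Julia):

```julia
# Julia ≤ 0.6 semantics (illustrative sketch)
for i = 1:3
    x = i      # no global `x` exists yet, so `x` is local to the loop
end
# x            # UndefVarError: the loop did not create a global

x = 5          # now define a global `x` ...
for i = 1:3
    x = i      # ... and evaluating the very same loop updates the global
end
# x == 3       # the global was silently clobbered
```

The same loop text means two different things depending on what happens to be defined at the time, which is exactly the failure of criterion #2.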
The 1.0 behavior, on the other hand, is simple, consistent, and easy to explain: it satisfies all of the criteria besides #3 beautifully. But apparently people find it so unintuitive because of the failure to satisfy #3 that it's a show-stopper for teaching. So yeah, existential threat territory again. (Can you see why we just love it when people make these existential threat kinds of comments?)
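For comparison, this is what the 1.0 rule does with the classic accumulation pattern in global scope, along with the explicit fix:

```julia
# Julia 1.0 semantics in global scope
t = 0
for i = 1:10
    t += i         # `t` here is a new local, read before it's assigned
end
# ERROR: UndefVarError: t not defined

# The fix is an explicit annotation:
t = 0
for i = 1:10
    global t += i  # explicitly update the global `t`
end
# t == 55
```

Statically predictable, but the failing first version is precisely what trips people up when teaching.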
One of the other changes that could be made is to mess with #4. Python does this: loops don't introduce scope. That seems a bit extreme; everyone seems to like that in Julia you don't have to worry about loop variable names clobbering things, and it matches comprehensions, where it would be even worse if the "loop variables" leaked out of the comprehension. Perhaps something else could be done where loops introduce scope, but a different kind of scope in which the loop variables are automatically local while the scope is porous to assignment inside the loop body. However, I for one like being able to define a local variable in a loop body and not have it litter the rest of my function when I don't need or want it later. And this kind of thing would bring us back into the ≤ 0.6 "complex and hard to explain" territory, so it doesn't seem ideal. Moreover, it would be complex and hard to explain everywhere, not just in global scope. The problems we're encountering are all about loops in global scope, so there's something appropriate about the complexity only affecting loops in global scope rather than making local scope pay the price as well.
Which brings us to the behavior in Jeff's new PR as proposed in this thread. It sacrifices a bit of #1, but in my opinion less so than the ≤ 0.6 behavior did: the rule here is that in a top-level for loop, if the first use of an unannotated variable is a read, then it's global; otherwise it's a local. That's a pretty simple, easy-to-explain rule. But people apparently just couldn't wrap their heads around "if a global by that name is already defined, then assignment updates the global, otherwise it creates a new local," so who knows? It satisfies #2 perfectly: the meaning of code no longer depends on any global state; if you evaluate the same sequence of expressions in the REPL multiple times, it means the same thing every time. It half satisfies #3, as follows. Expressions like
```julia
for i = 1:n
    t += i
end
```
always work the same in functions and the REPL. Other expressions like
```julia
for i = 1:n
    t = 100
end
```
either work the same or don't: in a function, whether `t` is local to the loop body depends on whether `t` exists as a local variable outside of the loop; in global scope it doesn't matter whether a global `t` exists or not, the `t` is always local to the loop.
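The function-scope side of that looks like this in every Julia version (the names `f` and `g` are just for illustration):

```julia
function f()
    t = 0
    for i = 1:3
        t = 100    # an outer local `t` exists, so the loop updates it
    end
    return t       # returns 100
end

function g()
    for i = 1:3
        t = 100    # no outer local `t`, so this `t` is local to the loop body
    end
    return t       # ERROR: UndefVarError — no `t` exists here
end
```

Under the PR rule, the global-scope version of this loop always behaves like `g`: since the first use of `t` is a write, `t` is loop-local regardless of whether a global `t` exists.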
So the ≤ 0.6 behavior satisfied 2/4 of the desirable criteria (#3 + #4), the 1.0 behavior satisfies 3/4 of them (#1 + #2 + #4), and the new behavior satisfies a slightly different 3/4 (½#1 + #2 + ½#3 + #4). It still seems better to me on the whole than the ≤ 0.6 behavior, and if, as many people seem to feel, it's super important to be able to write `for i = 1:n; t += i; end` in global scope and have it work, then it's also better than the 1.0 behavior. But of course, it all depends on how you value the various criteria. Do you think simplicity is the most important thing? Then you probably think the 1.0 behavior is best. Do you think that behavior in the REPL and function bodies matching as closely as possible is the most important thing? Then you probably think the ≤ 0.6 behavior is best. Do you think that statically predictable behavior that doesn't depend on global state is the most important, but also want accumulation in global scope to work? Then you probably think the new PR behavior is best. Personally, with more and more experience in language design over time, my appreciation for static predictability has increased markedly. But I also don't want to spend the rest of my life answering questions from people confused by accumulation not working in the REPL. So the PR behavior is a pretty good choice from my perspective: it's statically predictable and accumulation works. People still can't assign to globals from loops without using the `global` keyword. Anyone who is going to argue that the old ≤ 0.6 behavior is ideal and anything else is just a stop-gap has to justify why criterion #3 is so much more important than all other considerations.