An example of a breaking change in C++ was the change in the scope of a variable declared in the initialization statement of a for loop. In the pre-standard versions of C++, the scope of the variable extended to the end of the enclosing block, as though it were declared immediately before the for loop. During the development of the first C++ standard, C++98, this was changed so that the scope was only within the for loop itself. Compilers adapted by introducing options like -ffor-scope so that users could control the expected scope of the variable (for a period of time, when compiling with neither -ffor-scope nor -fno-for-scope, the GCC compiler used the old scope but warned about any code that relied on it).
The main objection to the current behavior is for interactive use, and this makes that worse… and in fact it would make interactive shells effectively unusable compared to competing languages.
FWIW I don't believe so. The snippet you quote is, I think, the form ~everybody expects to work. It's a distraction if, e.g. in teaching, beginners must "just be told" about global. In addition, Stack Overflow, blog and other code found via Google will not use global because of the SoftHardScope hack. Thus you first have to learn that in Julia for your quoted loop you need to write "global" and then additionally learn that with "some hack" you can leave it out again (or that there is a command line flag to modify the behaviour). Doesn't give a good impression imho.
To me it seems that the first (global) or especially this second (local) solution works great in general. I don't think it would be that bad if in some edge cases it is not totally obvious at first sight without looking at the rules or asking about interpretation; a beginner most likely would not run across it.
If there is the potential for confusion (or if the code is too long) it might be good practice to annotate with local or global anyway. This is the same as with complicated conditions where using (optional) brackets does clarify what "belongs together".
One of the problems with the 0.6 scoping rules was that they were too context dependent, which made scope tricky to reason about. I fear that the proposed solution, clever as it is, is a step back in that direction. I think it would be much better to simply make variables within for and while loops always have outer scope unless explicitly marked as local. To me the concern about accidentally littering the namespace with in-loop variables is not that compelling; a decent linter could help with that.
Fair enough. Now, I'm wondering if @jeff.bezanson's solution from the other thread is still on the table at all, perhaps with the addition of local let as per @tim.holy? Looking at that thread, it seems that there were fewer objections, at least concrete ones with counterexamples.
Could we have a branch / branches with the proposed behavior(s) for people to play around with before a final decision is made?
Just to clarify a point that I feel bears repeating: when we talk about global variables, we're not actually talking about "global" variables in the traditional sense. Rather, this is more like a discussion about "module"-scoped vs "function"-scoped vs "block"-scoped variables… not sure if this says anything about a potential solution, but I feel like keeping this facet of the issue in mind is important in arriving at the "right" solution.
Possibly, but AFAIK in the documentation "global scope" is used consistently, so I don't think that this is a frequent source of misunderstanding.
I wasn't meaning to suggest it as a point of confusion but rather as another thing to consider when contemplating solutions.
To take @tim.holy's example of shadowing first with an iteration variable: while doing so "globally" would potentially be disastrous to other code in the same module, the damage would, unless the module imported first, be limited to the module's scope (as opposed to ruining the entire runtime, as would be the case if "global" was… global).
So, for example, if static analysis of a module determined that there were no uses of first, then shadowing first "globally" in a for loop could be done without consequence. I'm not suggesting that this is a solution, but maybe some creative thinking in this direction is warranted?
In other words, since "global" isn't global, there is still the possibility of promoting things to global while still containing the side-effects of doing so.
You can't really do that analysis for the REPL. For example, quick quiz, if you have two statements, first = true and first([1,2,3]), what happens when you try them both? What if you swap the order in which you try them?
Answer (if you're playing along, no peeking): whichever one you try first succeeds. Then the other one throws an error, but the error depends on the order you selected.
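For anyone who wants to replay the quiz without disturbing their own session, here is a minimal sketch. The helper name try_in_order is made up for this illustration, and a fresh Module only approximates a new REPL session. It prints the error type rather than a message, since the exact error raised by the second statement may differ between Julia versions.

```julia
# Hypothetical helper: evaluate statements one by one in a fresh module
# (which has Base available, like Main) and record each outcome.
function try_in_order(stmts)
    m = Module(:Scratch)
    map(stmts) do s
        try
            string(s, " => ", Core.eval(m, Meta.parse(s)))
        catch err
            # Report only the error type; messages vary across versions.
            string(s, " => ", typeof(err))
        end
    end
end

# Order 1: the assignment comes first, so the later call hits a Bool.
foreach(println, try_in_order(["first = true", "first([1,2,3])"]))

# Order 2: the call comes first and resolves first to Base.first.
foreach(println, try_in_order(["first([1,2,3])", "first = true"]))
```

Each call uses its own scratch module, so the two orders do not interfere with each other or with Main.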
Consequence: since no amount of analysis can guess what the user will do next, one can argue it's undesirable to assign more variables than necessary at the REPL. Damage can be repaired (e.g., first = Base.first), but having to do that manually is a pain. ("Drat, which package is foo defined in again?") As a consequence I do a fair amount of playing around inside of let blocks, because I usually try to keep my REPL session going as long as is practical.
I'm not trying to say it would be fatal if we have to give this up, I'm just wanting to make sure the consequences of our choices are clear.
Precisely! Evaluation at the REPL occurs in a module scope, but it is, in effect, a never-ending module. Thus, certain whole-module analysis style solutions are unavailable to code at the REPL. One might reasonably ask, "Why must REPL evaluation occur within a module scope?" I'm not sure of the Julia-specific answer, but for every other language I've used with a REPL, the answer is usually something along the lines of: "Well, what other scope are you going to use?"
But maybe that's the real problem… maybe we shouldn't be trying to solve UX at the REPL with the same hammer as used to keep code within functions and modules clean. What if, for example, we introduced a script construct that had its own scoping semantics tailored toward REPL sessions and top-level script files?
Maybe within a script scope we track all variable shadowings so that, instead of having to do first = Base.first we could just do a pop_binding(first) (similar to the old workspace() call… which, for the record, I still miss). Also, as a potential added benefit, adding a script scope for REPL sessions could probably be done in v1.X without having to sacrifice semver (though making it the default for top-level scripts might violate strict semver).
Cute example. I don't think I'd call that a bug, unless you'd say that the cannot assign variable Base.first from module Main is itself a bug. Given that this is something we can't do, I think your example is validly dependent on details of the interpreter.
I don't have an opinion on how scopes should work, but I do believe that in order to accommodate new users and classroom use, it is important to have insightful error messages. For example, Matlab has an obscure scope rule stating that in breakpoint mode inside a function, you aren't allowed to introduce a new name if the function itself has a nested function. See below for an example, and notice the detailed error message that links to a section of the documentation. So even if I had never heard about scopes and was new to Matlab, I would not be too put off by this occurrence.
>> type outer_function
function x = outer_function()
    function y = inner_function()
        y = 7;
    end
    a = inner_function();
    keyboard
    x = a;
end
>> x = outer_function()
K>> z = 12;
Attempt to add "z" to a static workspace.
See Variables in Nested and Anonymous Functions.
K>> dbcont
x =
7
The discussion has gotten pretty off-topic, but I still like this proposal and should mention a property that I find really appealing about it. To understand whether an unannotated variable is local or global in a top-level scope, you just need to scan the block from top to bottom and find the first usage of it in that scope: if it's a read (including updating operators like +=) then the variable is global, if it's a write then the variable is local. This is because the rule requires all paths to match, so finding any first usage is sufficient, including the first one from the top. Requiring all paths to agree is just to avoid situations where generally valid code transformations would otherwise cause the code to change behavior. For example, negating a condition and swapping the if and else branches should not change the behavior of code. Requiring all paths to match ensures that it doesn't.
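As a rough illustration of that top-to-bottom scan, here is a toy sketch over Julia expressions. The names first_use and UPDATING_OPS are hypothetical and not part of Julia or of the proposal. The sketch only understands plain assignment and a handful of updating operators, and it does not attempt the all-paths-must-agree check.

```julia
# Updating operators read the variable before writing it.
const UPDATING_OPS = (Symbol("+="), Symbol("-="), Symbol("*="), Symbol("/="))

# Return :read or :write for the first use of `var` in `ex` (scanning in
# evaluation order), or nothing if `var` does not occur at all.
function first_use(ex, var::Symbol)
    ex === var && return :read
    if ex isa Expr
        if ex.head === :(=)
            # The right-hand side is evaluated before the assignment happens.
            r = first_use(ex.args[2], var)
            r !== nothing && return r
            lhs = ex.args[1]
            lhs === var && return :write
            return first_use(lhs, var)   # e.g. a[x] = 1 reads x
        elseif ex.head in UPDATING_OPS && ex.args[1] === var
            return :read
        end
        for arg in ex.args
            r = first_use(arg, var)
            r !== nothing && return r
        end
    end
    return nothing
end

loop = quote
    for i in 1:n
        j += i
    end
end

println(first_use(loop, :j))   # read first, so global under the proposed rule
println(first_use(loop, :i))   # written first, so local
println(first_use(loop, :n))   # read first, so global
```

Under the proposed rule a :read result would correspond to a global and a :write result to a local; anything this toy scanner cannot see (macros, nested functions, branches that disagree) is outside its scope.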
I should also point out that it is breaking in a very specific and inoffensive way: code that would cause a parse error and require an explicit local or global annotation with this change, is almost certainly buggy. This is because the "otherwise" clause in the rule has to be one of these two cases:
the first use of x is a write on one path and a read on another;
x is only written.
In both cases x is currently local. In the first case, if the path where x is read first is taken, then there will be an undefined variable error. In the second case, there will be no error, but either the write is intended to modify a global x which it won't actually do since x is local; or the assignment to x is left over from a time when there was a read somewhere in the scope that has since been deleted, in which case the write is vestigial and should be removed. Either way, it's a bug and this change should reveal it. So it's pretty hard to get too upset about the kind of "breaking" this change would cause: it would mostly result in fixed bugs.
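To make the two cases concrete, here is a small sketch that runs each one as if it were a script file, which is where the hard-scope behavior described above applies. The helper run_isolated and the snippet names case1 and case2 are invented for this illustration. Newer Julia versions may also print an ambiguity warning for the second snippet.

```julia
# Case 1: the first use of x is a write on one path and a read on another.
case1 = """
for i in 1:2
    if i == 1
        x = 10        # write first on this path, so x is local to the loop
    else
        println(x)    # read of a local that was never assigned this iteration
    end
end
"""

# Case 2: x is only written inside the loop.
case2 = """
x = 0
for i in 1:3
    x = i             # assigns a loop-local x, not the global above
end
x
"""

# Hypothetical helper: evaluate code file-style in a fresh module and
# return either the final value or the underlying error.
function run_isolated(code)
    m = Module(:Scratch)
    try
        include_string(m, code)
    catch err
        err isa LoadError ? err.error : err
    end
end

r1 = run_isolated(case1)
r2 = run_isolated(case2)
println(typeof(r1))   # the undefined-variable error from case 1
println(r2)           # the global x, untouched by the loop
```

In both snippets the assignment inside the loop never reaches a global, which is the kind of likely bug the proposed parse error would surface.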
This is important. I do not have a strong opinion on how scope needs to be done, but I do believe that the rule needs to be practical (I know how to check/use it in my code).
A clarifying question: suppose a user pastes (which is one of the use-cases frequently invoked when talking about this issue) the following code into the REPL:
for i in 1:n
    j += i
end
and tries to run it, only to realize that j and n are undefined, which results in an error because the user failed to paste the line
j = 0
and to initialize n (which was supposed to be a function argument).
Is the global/local property of j and n fixed at this time (as global) by the attempted evaluation, despite the error? Or does it remain unknown, and will be established when the user runs the code with these variables initialized?
I just hope that whatever the final solution is, it does not require looking past a statement/expression in order to figure out the meaning of a variable within that statement/expression.
One of the worst things IMO about pre v1.0 Julia was the "spooky action at a distance", where adding a statement pages further down could affect the meaning of a variable.
I am so happy that the scope rules were changed, and I'm thankful that Stefan & others are attempting to come up with a reasonable solution for the REPL, as long as that "spooky action" and weird corner cases are not reintroduced.
Nothing is affected by attempted evaluation. The whole point of moving away from the 0.6 design is that the meaning of code does not depend on global state of what global variables are defined or "exist". The code means something independent of whether j and n are defined; if it uses them as globals and they aren't defined, that will be an error, of course.
May I ask what is the rough timeline for this change? I am asking because I want to give a Julia tutorial in my school and if this change won't take too long, I will wait.