Introduction
I prepared two variants of very simple, possible future scoping rules, because I want to help toward an even better language, one that:
“is dirt simple to learn, yet keeps the most serious hackers happy”
(Why We Created Julia)
I have read all the scope-related topics on both Discourse and GitHub, every single reply, and thought about them for several days.
The main principles and constraints that I used are:
- Simplicity. Simple by default, and complex/advanced by choice (opt-in)
- In particular, avoid the issues beginner users encountered in interactive environments, as discussed in many topics
- Consistency. The fewer rules, the better. The strength of a rule is inversely proportional to the number of exceptions it has.
- Statically (compile time) resolvable scopes
- Same behavior of a scope construct within either global or local enclosing scope.
- Adding or removing access/read of a variable to/from a block should not affect the scope of the variable
Common parts to both variants
- Modules introduce global scopes, like in v1.0
- REPL and script statements/expressions work in the global scope of the Main module, like in v1.0
- By block I will mean any block-like construct that can enclose some instructions, and so can potentially introduce a local scope.
  In particular: function body, body of each iteration of `for` or `while`, body of each branch of `if..else...end`, `try/catch`, comprehensions, struct, macro, `begin..end`, `(;)` chains (blocks), … (others?)
  - Yes, the branch of an `if... else...end` could introduce a local scope: it makes just as much sense as an iteration of a `while` (in fact, the two should be semantically equivalent), and even more sense because conditionals can return values, like functions.
    Similarly for `begin..end` blocks and `(;)` chains.
- In a local-scope block, an expression that needs the value of a variable reads it from the most recent (re)definition of that variable above it in this block or, if there is none, from the enclosing scope.
  - The exception is functions, whose definition in the code can occur after their use.
    I mostly focus on usual, “non-function” variables below.
Variant A
This variant simply extends @jeff.bezanson's idea in “Another possible solution to the global scope debacle” so that it also covers scope constructs inside a local scope.
- Functions introduce local scope, but every other block, by default, introduces no scope (regardless of whether it is inside a local or global scope).
  - Except that the `for`'s iteration variable, and the variables defined right next to `let` (e.g. `let a = 2, b`), are always local to that block.
- An optional `local` keyword in front of a block name (like `local while` etc.) forces it to introduce a local scope as well.
  - So this applies to `if..else...end`, `begin..end` and `(;)` as well, as I argued in “Common parts” above.
- Inside no-scope blocks, `a = 1` (re-)binds the variable in the outer scope (be that local or global), while inside local-scope blocks it (re-)binds the variable in the local scope of the block.
- Inside any block (no-scope or local-scope), the target scope of an assignment can be reversed from the default with a `local` or `outer` keyword respectively (like `local a=4`), and this will apply to all instructions after this assignment in the block.
As a consequence, an assignment within a function (or other local scope) can only write to the outer scope (be it local or global) if prefixed with `outer`, so this should make it safer.
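To make the Variant A rules concrete, here is a minimal Python model of them. It is only a sketch under my reading of the bullets above: the `Frame` class, its method names and the `where` argument are made up for illustration, and I assume that `outer` targets whatever scope the enclosing block would assign to by default.

```python
class Frame:
    """One block in a hypothetical model of Variant A (not real Julia)."""

    def __init__(self, parent=None, introduces_scope=False):
        self.parent = parent
        # Functions and blocks marked `local` introduce a scope.
        # The global frame (no parent) always counts as a scope.
        self.introduces_scope = introduces_scope or parent is None
        self.vars = {}
        self.sticky = {}  # names already marked `local`/`outer` in this block

    def lookup(self, name):
        """Read from this block, then from the enclosing blocks."""
        frame = self
        while frame is not None:
            if name in frame.vars:
                return frame.vars[name]
            frame = frame.parent
        raise NameError(name)

    def _default_target(self):
        # A no-scope block passes plain assignments on to its enclosing block.
        frame = self
        while not frame.introduces_scope:
            frame = frame.parent
        return frame

    def assign(self, name, value, where="default"):
        """Model `a = v`, `local a = v` or `outer a = v`."""
        if where == "default":
            # An earlier `local`/`outer` in this block keeps applying.
            where = self.sticky.get(name, "default")
        else:
            self.sticky[name] = where
        if where == "local":
            target = self
        elif where == "outer":
            if self.parent is None:
                raise ValueError("the global scope has no outer scope")
            target = self.parent._default_target()
        else:
            target = self._default_target()
        target.vars[name] = value


globals_ = Frame()
globals_.assign("total", 0)

loop = Frame(parent=globals_)            # plain loop body: no scope
loop.assign("total", 1)                  # rebinds the global `total`

func = Frame(parent=globals_, introduces_scope=True)  # function body
func.assign("total", 99)                 # stays local to the function
func.assign("count", 5, where="outer")   # models `outer count = 5`

print(globals_.vars)  # {'total': 1, 'count': 5}
print(func.vars)      # {'total': 99}
```

The same `Frame(parent=..., introduces_scope=True)` call also stands in for a block written with the proposed `local` prefix, such as `local while`.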
Variant B
The missing ingredient
Introduce a different syntactical notation for defining vs changing a binding.
This idea has also been suggested by @Balance, here, here, and, IIUC, in this GitHub reply by @jeff.bezanson, and probably by many others I’m not aware of.
For example: keep `=` for defining new variables (to avoid too much change in syntax), and introduce `=!` (like the convention for mutating functions), or `:=` or `=:` (or other) for changing the binding of a previously defined variable (re-assignment).
I think when we program we already have this different intent in mind (“I’m going to define this to be…”, “let this be…” vs “I’m going to change that var to…”, “let this change to…”), so there is no additional mental cost in actually conveying that intent when we write the code.
It’s a low-hanging fruit that I think many dynamic or scripting languages don’t pick.
Introducing this distinction not only makes more consistent scoping rules possible, it also lets the compiler catch many definition/re-definition errors, based on the rule that:
a binding can only be defined once in a given scope, and you cannot change an undefined binding
For example:
- After defining `foo = 3`, attempting `foo = 4` must mean that I forgot I had already defined `foo`. The compiler error will remind me to choose another name.
- After only defining `foo = 3`, attempting `fooo =! 4` must mean that I forgot that the correct name was `foo`, or perhaps I made a typo. The compiler error will remind me to correct the name.
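The two error cases above can be mimicked with a small Python model of a single scope. This is only an illustrative sketch: `Bindings`, `define` and `rebind` are hypothetical names, with `define` standing in for `=` and `rebind` for the proposed `=!`, and the errors are raised at run time here rather than by a compiler.

```python
class Bindings:
    """A single scope where defining and changing are separate operations."""

    def __init__(self):
        self.vars = {}

    def define(self, name, value):
        # Models `name = value`: only allowed once per scope.
        if name in self.vars:
            raise NameError(f"{name} is already defined in this scope")
        self.vars[name] = value

    def rebind(self, name, value):
        # Models `name =! value`: only allowed for an existing binding.
        if name not in self.vars:
            raise NameError(f"{name} is not defined, cannot change it")
        self.vars[name] = value


scope = Bindings()
scope.define("foo", 3)            # foo = 3

errors = []
attempts = [
    (scope.define, "foo", 4),     # foo = 4    -> forgot foo already exists
    (scope.rebind, "fooo", 4),    # fooo =! 4  -> typo in the name
]
for action, name, value in attempts:
    try:
        action(name, value)
    except NameError as err:
        errors.append(str(err))

scope.rebind("foo", 4)            # foo =! 4 is the intended change

print(errors)
# ['foo is already defined in this scope', 'fooo is not defined, cannot change it']
print(scope.vars)  # {'foo': 4}
```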
The scope rules B
1. Every block introduces local scope, with rules below.
   - So this applies to `if..else...end`, `begin..end` and `(;)` as well, as I argued in “Common parts” above.
2. `a = 3` (definition) creates the binding within the current scope (be it local or global) if `a` has not yet been defined in this scope, but gives an error if it has been.
3. `a =! 4` (change / re-assignment) changes the previously defined binding:
   - if `a` has been defined in the current scope, then it changes that.
   - else, if it has been defined in the outer scope, then it changes that binding.
   - otherwise, it gives an error.
4. If necessary, `outer` and `local` could be introduced so that:
   - `outer a=3` within a block means creating the binding in the outer scope (be it global or local)
   - `local a =! 4` within a block ensures that the binding is changed within the local scope
So an `a =! 4` within a function body (or other block) may change the binding in the enclosing scope (local or global), but in my mind this apparent reduction in safety is compensated by:
- overall reduction in definition/redefinition errors as explained in “The missing ingredient” section
- the optional `outer` and `local` keywords from the last rule above
- the possibility of having the compiler issue a warning for each re-assignment that changes an outer binding as opposed to a local one.
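The scope rules of Variant B can be sketched as a chain of scopes in Python. Again this is a hypothetical model, not proposed Julia syntax: `Scope`, `define`, `rebind` and the `outer`/`local` flags are names chosen for illustration, and I assume that `=!` keeps searching through all enclosing scopes rather than only the nearest one.

```python
class Scope:
    """A block scope with a link to its enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def define(self, name, value, outer=False):
        """Model `a = 3`, or `outer a = 3` when outer=True."""
        target = self.parent if outer else self
        if target is None:
            raise NameError("there is no outer scope here")
        if name in target.vars:
            raise NameError(f"{name} is already defined in that scope")
        target.vars[name] = value

    def rebind(self, name, value, local=False):
        """Model `a =! 4`, or `local a =! 4` when local=True."""
        scope = self
        while scope is not None:
            if name in scope.vars:
                scope.vars[name] = value
                return
            # `local` forbids falling through to the enclosing scopes.
            scope = None if local else scope.parent
        raise NameError(f"{name} is not defined")


top = Scope()
top.define("a", 3)                  # a = 3 in the global scope

body = Scope(parent=top)            # any block, e.g. a loop body
body.rebind("a", 4)                 # a =! 4 changes the global binding
body.define("a", 10)                # a = 10 now defines a local `a`
body.rebind("a", 11)                # a =! 11 changes the local one
body.define("b", 0, outer=True)     # outer b = 0 defines `b` one level up

try:
    body.rebind("b", 1, local=True)  # local b =! 1: `b` is not local
except NameError as err:
    print(err)                       # b is not defined

print(top.vars)   # {'a': 4, 'b': 0}
print(body.vars)  # {'a': 11}
```

A compiler warning for re-assignments that reach an outer binding would correspond to the point where the loop in `rebind` moves on to `scope.parent`.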
Conclusion
There is value in simplicity.
Both A and B variants satisfy, IMO, all “principles and constraints” in the “Introduction” section.
How to achieve them? I’m not qualified to answer. But where there’s a will, there’s a way…
I believe the core developers and many of you have had similar ideas, so I’m not claiming anything revolutionary here. Also, there are probably problems with my rules that I don’t realize yet.
Thanks for your consideration. Any questions, remarks and corrections are welcome.