I have been wondering if there is a reason for not making the initialization of variables explicit. I think this is done very elegantly in golang with the “:=” operator. That way one would know exactly when an entirely new variable is introduced (and one could also think about fixing a variable’s type at creation; maybe this would avoid some strange errors and enable further optimization).
Also you could avoid accidentally overwriting a variable’s value when you are not aware that a variable with the same name exists in the current scope.
So what did I miss in my considerations?
It might be good to be clear in your post about which of these you are asking:
- Historically, when Julia was still very much in flux, why was it decided not to do this, or was it considered at all?
- Why not add this feature to the language now?
These are subtly different questions with different answers, and those answers come from different (but overlapping) people.
The historical question is a kinda interesting chill discussion.
The feature request is liable to be contentious.
Well, it’s kind of both. I was wondering if
- there was a reason for not including this feature in julia historically or if it was just that it was not thought about
- and if the reasons, if existent, for not implementing this feature are still valid today.
Also, of course, you would have to consider that this would be a breaking change in tons of julia code, so I know it’s hard to argue for it.
To answer a very small part of this, this would not allow further optimization. Variable scopes are fully determined at compile time already, so this wouldn’t give llvm any new information.
Are you aware of

    let x = ...
        ...
    end

? Sounds to me like this would address most points you brought up.
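To make the let suggestion concrete, here is a minimal sketch in current Julia. The function name let_demo is made up for illustration; the point is only that the let block introduces a new x without touching the outer one.

```julia
# Hypothetical helper, for illustration only.
function let_demo()
    x = 1
    y = let x = x + 10   # new `x`, initialized from the outer `x`
        x * 2            # only the inner `x` is visible in this block
    end
    return x, y          # the outer `x` is unchanged
end

let_demo()
```

The right-hand side of the let binding is evaluated in the enclosing scope, which is why the inner x can be initialized from the outer one.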
I was not aware of that, maybe you are right. However I do not know if this is feasible if you just want to create a single variable since it blows up code size. Also it adds another layer of nesting which may be undesirable. However, maybe it would make sense to define

    for i=1:10
        x := i+1
        ...
    end

as a shorthand for

    for i=1:10
        let x=i+1
            ...
        end
    end

So a := would just add a let ... end block to the innermost block of code.
This would not be a breaking change and would not require too much work.
We already have scope keywords, like global and local. := is not needed. Scope keywords are more flexible than := because they can also be used on function declarations. Also, there is no added benefit to using :=, since there is no difference between x := 1 and local x = 1. Instead, you should propose a const variable declaration syntax to avoid accidental overwrites.
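As a rough illustration of the local keyword mentioned above, the following sketch contrasts a plain assignment with a local declaration inside a loop. The function names are invented for this example.

```julia
# Plain assignment inside the loop updates the existing `x` of the function.
function plain_demo()
    x = 1
    for i in 1:3
        x = i * 10
    end
    return x
end

# `local` declares a fresh `x` in the loop body; the outer `x` is untouched.
function local_demo()
    x = 1
    for i in 1:3
        local x = i * 10
    end
    return x
end

plain_demo(), local_demo()
```

Here plain_demo ends up with the value from the last iteration, while local_demo still returns the initial value.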
Another problem is that := is a syntactic construct rather than a semantic one. It’s generally impossible to know which assignment statement actually initializes the variable, that is, which one is the first to assign it. Consider this multi-threaded program:

    parallel for i in 1:10
        x := i
    end

So you don’t really know which i initializes the x.
AFAIK this was an explicit design choice: having different syntax for introducing variables into a scope and assignment is something that programmers used to Python, R, Matlab, etc, may find inconvenient.
This came up in the (epic) scoping discussions: having separate syntax for the two would have solved a lot of scope problems trivially; cf. (let ((x x-val)) ...) in Lisp.
I actually don’t think that would be such an unreasonable proposal. I am assuming there could then only ever be one := per variable name in each scope and multiple ones would be a syntax error? The main question here is just whether this pattern is used often enough to warrant its own syntax. I think this does warrant an issue on GitHub at least to see what others think.
I would love to have a := 3 as a shorthand for a::Int = 3 (and have it work in global scope)!
So I think := could serve two purposes that often go together:
- Make sure a variable is not re-declared by mistake.
- Give the variable a fixed type inferred by the value on the right-hand side. This means you can fix the type without knowing/writing the type signature (which can be complex).
Plus, it’s nice for readability as the code clearly says “this is where the variable is first defined”.
I also wonder: would this not solve a lot of performance issues if it becomes the “default” declaration syntax (i.e. recommended when the flexibility of = is not needed)? I mean, this would give
- a nice syntax to declare globals with good performance.
- a simple way to catch type instabilities: declare with := and you will get an error if you later assign a value that cannot be converted to the initial type.
- if we can make the parser/compiler a bit smarter this could also help with closure performance (see #15276).
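For reference, part of this wish list can be written explicitly in Julia 1.8 and later, where type annotations on global variables are supported. The sketch below assumes such a version; counter and set_counter! are names made up for the example, and the type still has to be spelled out rather than inferred from the right-hand side.

```julia
# Requires Julia 1.8 or later (typed globals).
counter::Int = 3

# Hypothetical setter: the assigned value is converted to Int,
# and the assignment throws if no conversion exists.
function set_counter!(v)
    global counter = v
    return counter
end

set_counter!(4.0)          # stored as an Int
# set_counter!("hello")    # would throw: a String cannot be converted to Int
```

This gives the error-on-mismatch behavior described in the list, but not the "declare exactly once" guarantee that a dedicated := would add.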
AFAIK not requiring explicit declarations for locals was an explicit design choice for Julia. Cf eg
Having recently learned Go (where := is used very much), I was very pleased with the middle ground that they have in that kind of code between explicit scope and syntactic parsimony.
My own background is heavily influenced by an older tradition of Scheme (and Julia is very much of a Schemish flavor); as a result, I find let to be a natural expression that has very nice and clear scope. I would have expected that my reaction to := would be more negative. In fact, I found Go’s syntax to work really well.
On the other hand, I don’t think that the benefits of := would be fully recognized without some of the other aspects of Go, notably its very intentionally opinionated nature. For instance, the way that Go polices unused imports, unused variables and simple definition structure really work together for a more maintainable world. That level of strictness occasionally binds, but the solution is always clear enough that it is easy to fix. It would be hard, however, to convince somebody from a very dynamic world of that fact without actually having them experience it first hand.
That makes my head hurt.
Perl had a similar feature where you would have to write

    my $var = 0;

upon the first use. It’s like int var = 0 in C but without the type. The advantages are code readability (a visible guarantee that this is the first use) and catching of typos. If you typo’d $car = 1 later on, it’d be a syntax error. AFAIK the my keyword had no real effect other than causing an error if you didn’t use it.
Backward compatibility was handled by having the user opt in with use strict. These sorts of checks were useful enough that all well-written code would have use strict on the first line.
Though you could claim having to declare variables is just a pain, I considered it to be an important code quality feature and missed it when switching to python.
This is certainly an interesting design direction that’s fun to think through. You would want something like this: := is required to initialize a variable, = can only be used to update an existing variable. That would take some getting used to for people coming from Python, R or Matlab, who are unaccustomed to thinking about variable declaration much. Their question would likely be “what is the difference?” I’ll leave that to someone else to try to answer in this hypothetical alternate universe (perhaps a different version of me).
It could, as you suggested, also declare the type of the variable to be the type of whatever value is assigned. One of the reasons you don’t want variables to default to being type-const is that it’s fairly common for people who are using a REPL or doing other “fast and loose” work to reuse the same variable name like x for something different, say a float in one case and a string in another. The := versus = distinction could help with that: if x := occurs twice, the second one creates a new, separate x variable. Any previous capture of x still references the old version of x (which can no longer be modified except by a closure); any new capture references the new x with a different type. That’s slightly confusing but I think that it would be fine and people would be able to enter multiple examples in the same REPL, which is the main thing they want. And we’d have type-stable globals, which is pretty nice. So this:

    x := 0
    x = 1           # fine, works
    x = "hello"     # error, can't convert `String` to `Int`
    get() = x       # capture this `x`
    set(v) = x = v  # modify this `x`
    x := "hello"    # works, new `x`
    get′() = x      # capture new `x`
    get()           # returns 1
    get′()          # returns "hello"
    x = "goodbye"
    get′()          # returns "goodbye"
    get()           # still returns 1
    set(2)          # only way we can change the old `x` now
    get()           # returns 2

So that’s pretty nice. You can reuse common variable names but get and get′ get to assume consistent types, which seems good. Thinking through some examples. First closures that do and don’t modify an outer local:

    function f()
        x := 1
        g() = x = 2   # modifies outer `x`
        h() = x := 3  # doesn't affect outer `x`
    end

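For comparison, the same pair of closures can be approximated in current Julia, with local standing in for the hypothetical :=. This is only a sketch; closure_demo and the helper variables are invented names.

```julia
function closure_demo()
    x = 1
    g() = (x = 2)       # assigns the outer `x`
    function h()
        local x = 3     # separate `x`, local to `h`
        return x
    end
    h()
    after_h = x         # outer `x` as seen after calling `h`
    g()
    after_g = x         # outer `x` as seen after calling `g`
    return after_h, after_g
end

closure_demo()
```

Calling h leaves the outer x alone, while calling g changes it, which mirrors the g and h in the hypothetical version.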
Now for loops. The first version has a new iteration variable in each iteration:

    for i := 1:10
        # new `i` in each iteration
    end
    # `i` is undefined here

Next a version where the iteration variable is an outer local:

    i := 0
    for i = 1:10
        # same `i` in all iterations
    end
    # `i` is 10 here

Unclear what one would do with the in syntax that many people prefer. Do we just treat in the same way as :=? The assignment syntax for for loops seems strictly better in this design since it lets you choose between using an outer variable and having a new variable per loop iteration. Classic example of non-obvious interactions between different language features.
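Current Julia already offers this choice inside a function body, though with a keyword rather than a different assignment operator: a for loop creates a new iteration variable by default, and for outer reuses an existing local. A small sketch, with made-up function names:

```julia
# Default: the loop gets its own `i`; the outer `i` is not modified.
function fresh_loop()
    i = 0
    for i in 1:10
        # new `i` in each iteration
    end
    return i
end

# `outer`: the loop reuses the existing local `i`.
function outer_loop()
    i = 0
    for outer i in 1:10
        # same `i` as the one declared above
    end
    return i
end

fresh_loop(), outer_loop()
```

Note that outer works with both the = and the in spelling of the loop header, so it sidesteps the question of how in should be treated.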
I can’t think of any other interesting example off the top of my head. Anyone think of any?
Could this type of syntax help for easier compiling and linking of standalone executables (slim executables without the julia runtime included) ?
However there I see the issue that casting should be somehow allowed, else
would throw an error which may be undesired.
I would only fix a type if the type is explicitly specified.
I was thinking that the normal assignment behavior should continue, and if someone uses := then you enforce the types; that way you can get the benefits of having a special assignment operator without losing the flexibility.
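Something close to this split exists today for local variables if the type is written out by hand: a local declared with a type annotation converts every later assignment, while an unannotated local can change type freely. The sketch below uses invented function names and is not the proposed := itself.

```julia
function typed_vs_untyped()
    a::Int = 0    # declared type: later assignments are converted to Int
    b = 0         # no declaration: later assignments may change the type
    a = 2.0       # converted, `a` stays an Int
    b = 2.0       # `b` is now a Float64
    return a, b
end

function bad_assign()
    a::Int = 0
    a = "hello"   # throws at run time: a String cannot be converted to Int
    return a
end

typed_vs_untyped()
```

The difference from the := idea is that the annotation must name the type explicitly instead of taking it from the initial value.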
I think then we would need a new type for variables initialized with :=. Otherwise, how do we know if a particular assignment can change the type of the LHS? If the initialization happens in the same scope, you could conceivably tag the variable and restrict it there, but if that variable is passed into a function you’d need a way to distinguish variables initialized with :=.
I think this is a bad idea: scoping is inherently a difficult and subtle concept which new (and old) users struggle with. But if := becomes possible it will appear all over the place, and require a careful explanation (that no one will actually understand) very early in the docs… so the more verbose version with let sounds much better to me.