Why no := operator for initializing variables

This was a lot more necessary in Perl, because $x = 123 inside a function body assigns to (or creates) a global variable. Given that, the only non-breaking option was to introduce a new syntax for declaring a new local variable, which was my $x = 123. The next evolution was to introduce our $x = 123 to explicitly ask to assign to the global. Once you have that, you can introduce use strict, which disallows creating a new variable implicitly by assignment. But note that they didn’t get there very intentionally; it was a bit of a rambling path.

My recollection is that local $x to create a dynamically scoped local variable came very early. I know that my $x for lexically scoped variables came quite a bit later. I don’t recall if strict came before my. But yes, it was more useful in Perl than it is for Julia (and probably Python).
I agree that introducing lexically scoped variables a few years down the road was probably not planned. But Perl was originally designed with simplicity in mind, and not thinking about scope at all is simplest. local could be used by more advanced users. These are just guesses based on my recollection of Perl culture.

No, the syntax for declaring variables has no impact on standalone executables.

Without the type-stable globals, this idea doesn’t seem very appealing to me. Then all it does is trade one way of declaring new variables (by x = when an outer local named x doesn’t exist) for another one (by x :=). I can guarantee that this would also confuse people, and my guess is that it would be more confusing to more people than what we do now. What we do now has the property of mostly doing what people want without them having to think about the difference between declaration and assignment.
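To make the current rule concrete, here is a minimal sketch (the function names are made up for illustration): inside a function, x = either updates an existing outer local or declares a new one, depending only on whether an outer local of that name exists.

```julia
function outer_updated()
    n = 0                 # declares the local n
    function bump()
        n += 1            # an outer local n exists, so this updates it
    end
    bump(); bump()
    return n
end

function outer_untouched()
    n = 0
    function bump()
        local n = 1       # explicit `local`: a separate n inside bump
        return n
    end
    bump()
    return n
end

outer_updated()    # 2
outer_untouched()  # 0
```

A := operator would replace the implicit "does an outer local exist?" check with an explicit marker at each declaration site.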

It’s just an interesting thought experiment — we can’t practically do this even in a 2.0 release. It would be a massively breaking change: we would need to disallow implicit declaration of locals by assignment, which breaks all Julia code everywhere. I also think that what we do now is good and less annoying for new users since they will be accustomed to just assigning to initialize and declare new locals if they are coming from Python, R, Matlab or Ruby (to mention just a few).

Reminds me of an adventure game I played. It started with a joke. One person kept answering questions with another question. So another person asked him, “Why do you always answer a question with another question?”, to which he replied, “Why not?”

I should maybe add that the only change I think we should make to the local scope behavior as documented here is that this sentence:

  • in non-interactive contexts (files, eval), an ambiguity warning is printed and a new local is created;

should instead be changed to this:

  • in non-interactive contexts (files, eval), an ambiguity warning is printed and the existing global is assigned;

Of course, that makes the interactive and non-interactive behavior the same except for the ambiguity warning so I would probably change that section to this:

  1. Soft scope: If x is not already a local variable and all of the scope constructs containing the assignment are soft scopes (loops, try / catch blocks, or struct blocks), the behavior depends on whether the global variable x is defined:
    • If global x is undefined, a new local named x is created in the scope of the assignment;
    • If global x is defined, the global variable x is assigned.

The assignment in the last case (soft scope, existing global) is considered ambiguous: it is not clear from local syntactic context alone whether the assignment is meant to assign to the global or create a new local as it would if there were not already a global by that name. Therefore, when this code is evaluated in a file or other non-interactive context, an ambiguity warning is printed, prompting you to explicitly declare the assignment to be global or local (or rename the variable to avoid the name collision). In interactive contexts the global is assigned without any warning.
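As a rough illustration of how the ambiguity is resolved by hand, the sketch below uses explicit global and local annotations in top-level loops, so it behaves the same in a file and in the REPL. The variable names are invented for the example.

```julia
total = 0
for i in 1:3
    global total += i    # explicit: assign the existing global
end

hits = 0
for i in 1:3
    local hits = i       # explicit: a new local per iteration, global untouched
end

for i in 1:3
    fresh = i            # no global `fresh` exists, so a new local is created
end

# Afterwards: total == 6, hits == 0, and `fresh` is not defined at top level.
```

Only an unannotated assignment to an already-defined global inside such a loop falls into the ambiguous case described above.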

One mitigating factor that might make it just barely plausible to make a change like this is that it is a purely syntactic change, so it is highly amenable to automatic upgrading: you can parse the old syntax and generate the new syntax completely reliably. That said, there’s a lot of Julia code in the world at this point, and there will be even more by the time we get to 2.0, so I really don’t think it’s a good idea to break all of it.

To answer part of the historical question: I consider a distinction like := vs. = to be a “fussy” language feature. They are almost the same thing, there is little natural intuition about what the difference might be, and yet you sometimes must use one, and sometimes the other. It makes matters worse that Python has now added := with a different meaning. To review, we would have

  1. Algol / Pascal: := is updating assignment
  2. Python: := is an expression yielding a value, = is a statement (both are just = in Julia)
  3. Go / Julia (proposed): := declares a variable

Of these, I’m a bit sympathetic to Algol, since mutating a binding is arguably very different from establishing a mathematical equality.
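As a small illustration of item 2 in the list above: in Julia a plain = assignment is already an expression that yields the assigned value, so there is no need for a separate operator to get Python's := behavior. The variable names here are arbitrary.

```julia
q = (p = 3) + 1    # the inner assignment evaluates to 3, so q == 4
a = b = 0          # chained assignment, parsed as a = (b = 0)
```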

Ok, so I need to randomly pepper my code with colons. Faced with something like that, I like to ask “could the compiler insert them for me?” And yes it can, so why bother me with it. The motivating example is

x = 0

function f()
    x = 1
    ...
end

I claim that you basically NEVER want the assignment to x inside the function to modify the global. The asymmetry here is massive. Thousands of functions look like this, and all of them except 1 or 2 rare cases want a new local variable. I absolutely do not want to tell somebody “aha, you forgot the colon, so this assigns a global instead!” It is very, very bad for the simplest, most obvious syntax to do something horrible that you never want to do.
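A minimal sketch of what Julia does today with that example (the function names shadow and clobber are made up here): the plain assignment creates a local, and touching the global requires an explicit opt-in.

```julia
x = 0

function shadow()
    x = 1           # plain assignment: a new local x, the global is untouched
    return x
end

function clobber()
    global x = 2    # explicit opt-in is required to assign the global
    return x
end

shadow()    # returns 1, and the global x is still 0
clobber()   # returns 2, and the global x is now 2
```

So the common case gets the short syntax and the rare case gets the longer one.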

I realize this is partly subjective. I dislike “fussy syntax”, others dislike “magic implicit stuff” more.

I would add

  4. Mathematica: := is a delayed (lazy) version of =, where the right-hand side is re-evaluated on each use, so the value changes when the variables it references do

This is actually a much more concrete difference between = and := and something I’ve found useful, but not something that I see working in Julia.

Actually we can already do this by a small macro:

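macro stable(x)
    # Only plain assignments `lhs = rhs` are supported; anything else is rejected.
    if !(x isa Expr && x.head == :(=))
        error("@stable expects an assignment, e.g. `@stable y = 1`")
    end
    return quote
        # `tmp` is renamed by macro hygiene, so it cannot clash with user variables.
        local tmp = $(esc(x.args[2]))
        # Declare the target as a local whose type is fixed to that of the initial value.
        local $(esc(x.args[1]))::(typeof(tmp)) = tmp
    end
end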

A small test:

# type of y is String
function f(x)
  @stable y = "212"
end
# multiple declaration is a syntax error
# ERROR: syntax: multiple type declarations for "y"
function f(x)
  @stable y = "212"
  @stable y = 1
end

Edit: There are three limitations of this approach:

  1. It does not work in global scope.
  2. Type conversion is still allowed.
  3. It introduces an additional variable tmp, which might make the output of code_typed or code_warntype hard to read.

Limitation 1 is not a huge problem unless type declarations become supported on global variables. Limitations 2 and 3 would require some special language support.
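To illustrate the type-conversion limitation without the macro, here is a small sketch using an ordinary typed local (the function names are invented): a later assignment of a different type is not rejected at compile time, it is converted at run time, and only fails when the conversion is inexact.

```julia
function converted()
    local y::Int = 1
    y = 2.0          # accepted: the value is converted to Int
    return y
end

function rejected()
    local y::Int = 1
    y = 2.5          # throws InexactError: 2.5 has no exact Int representation
    return y
end

converted()   # returns the Int 2
```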

Unfortunately it only works in the local scope, not global scope.

Yes. But global variables don’t even allow type declarations, so it’s quite natural for this macro not to work in global scope (since global variables have no stable types…).

This is “just” a missing feature that someone needs to implement. Hard enough that it hasn’t been done yet but not difficult in principle.

Seems alright, but not quite as short or elegant as x := 1.

I think it would be very nice to differentiate between variable declaration and updating with different operators as you describe, perhaps swapping ‘=’ with ‘:=’. The updating operator could then be a function with user-defined methods, such as operator= in C++… Any time you have a nontrivial struct which you want to update and don’t want to completely re-allocate, you currently need to use prefix notation (myupdate!(x, …)) instead of infix notation such as x = … or x := … to re-use the allocated memory in x, which is cumbersome.

This is a separate topic, but: if you add to it the ability to overload fused operations, you could express all the linear algebra operations with infix notation, e.g. something like (ignoring for the moment issues with aliased arguments)

function (:=,*,+)(A::Matrix, B::Matrix, C::Matrix, D::Matrix)
    # Implement A = B*C + D efficiently here, using existing storage in A
end

So this could be done with dense matrices, sparse matrices, tensors, other data structures, etc.
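For comparison, a sketch of what the same operation looks like with today's prefix notation. The name muladd_into! is hypothetical, and the sketch assumes A does not alias B or C.

```julia
using LinearAlgebra

# Hypothetical helper: computes A = B*C + D while reusing the storage of A.
# Assumes A does not share memory with B or C.
function muladd_into!(A::Matrix, B::Matrix, C::Matrix, D::Matrix)
    copyto!(A, D)          # A now holds D
    mul!(A, B, C, 1, 1)    # five-argument mul!: A = B*C*1 + A*1
    return A
end

A = zeros(2, 2)
muladd_into!(A, ones(2, 2), fill(2.0, 2, 2), fill(3.0, 2, 2))   # every entry of A is 7.0
```

The proposal above would let this be written in infix form and dispatched on the argument types.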

It seems you could just use ‘:=’ as the updating operator without breaking things, rather than redefining ‘=’. (Although I admit I like the other way better.)

Separating declaration from updating would also just make code easier to reason about, in my opinion.

For arrays there is already A .= B*C .+ D doing what you want.

Yes, I’m aware of that. Two comments:

  1. In this particular case, will B*C cause a memory allocation?

  2. The more general notation will work with other data structures than simple arrays.

Here is an example illustrating the memory allocation:


julia> function f()
          A = zeros(10,10)
          B = ones(10,10)
          C = 2 .* ones(10,10)
          D = 3 .* ones(10,10)
          @time A .= B*C .+ D
          return nothing
       end
f (generic function with 1 method)

julia> f()
  0.000005 seconds (1 allocation: 896 bytes)

In fact it allocated 896 bytes, whereas I was expecting 8*10^2 = 800 bytes.

I recently stumbled upon this. It looks related, although I don’t know how to use it.

Have a look at mul! (it overwrites the first argument):

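julia> using LinearAlgebra, BenchmarkTools

julia> A = zeros(10,10); B = ones(10,10); C = 2 .* ones(10,10); D = 3 .* ones(10,10);  # same matrices as in the example above

julia> @btime $A .= mul!($D, $B, $C)
  793.755 ns (0 allocations: 0 bytes)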

(maybe better add D to A first and then mul!(A, B, C) )

But I feel that should be discussed further elsewhere since it is not really related to :=.

If you want memory-efficient functions, there is already the convention of a trailing !.
For example, how would you push a number to a vector with :=?
We have push! and pop! (and many more) defined for those operations, which are much more expressive than :=.

Did you mean to suggest mul!(D, B, C, 1, 1)? mul!(D, B, C) doesn’t quite do the same as D = B * C + D. Also, this overwrites D in-place, which may or may not be desired.
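A small sketch of the difference between the two calls, using 2×2 matrices chosen only for illustration:

```julia
using LinearAlgebra

B = ones(2, 2)
C = fill(2.0, 2, 2)

D1 = fill(3.0, 2, 2)
mul!(D1, B, C)          # three-argument form: D1 = B*C, the old contents are discarded

D2 = fill(3.0, 2, 2)
mul!(D2, B, C, 1, 1)    # five-argument form: D2 = B*C*1 + D2*1

# D1 is all 4.0 (just B*C), while D2 is all 7.0 (B*C plus the original 3.0).
```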
