Is it appropriate to say a variable is a label or box?

I’m new to Julia, but not to programming, so to help me learn, I’m writing some notes and training material.

While reading Python Crash Course, the author describes a variable as:

Variables are often described as boxes you can store values in. This idea can be helpful the first few times you use a variable, but it isn’t an accurate way to describe how variables are interpreted in Python. It’s much better to think of variables as labels you can assign to values. You can also say a variable references a certain value.

This site also says the same:

If you imagine that variables are like boxes, you cannot make sense of assignment in Python. For an assignment, you must always read the right-hand side first: that’s where the object is created or retrieved. After that, the variable on the left is bound to the object, like a label stuck to it. Just forget about the boxes.

I’ve used Python for some time now, and I’ve always thought of variables as containers or boxes for values. This video also uses the box metaphor to describe variables.

In Julia, what is the correct way to interpret variables: as labels or as boxes?

Labels.


Yeah, I also like the label analogy. I think it helps clarify that y = x doesn’t do anything other than associate the name :y with the current value of x.

The good news is that if you’re used to Python’s variable semantics, you should find that Julia’s behavior generally matches your expectations. In both Julia and Python, doing y = x; y = z never changes x in any way; it just attaches the :y label to a new value. Likewise, in both Julia and Python, doing y[x] = z results in a function call that looks something like “set the x index of y to value z” and thus modifies the current value of y.
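A minimal sketch of the difference between rebinding and mutation, assuming plain vectors; the values are made up for illustration:

```julia
# Rebinding versus mutation. The names x, y, z follow the post above;
# the array contents are illustrative.
x = [1, 2, 3]
z = [10, 20, 30]

y = x        # y and x now label the same array
y = z        # rebinding: the :y label moves to z's array, x is untouched

y[1] = 99    # lowered to setindex!(y, 99, 1): mutates the array y labels

println(x)   # x still holds its original contents
println(z)   # z labels the same array as y, so it sees the change
```

Here `y = z` only moves a label, while `y[1] = 99` changes the object itself, which is why the change is visible through `z` as well.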


I like the “label” concept. Although the content of a memory cell can be changed by inserting a new value into it and in this way have a varying content, from a math point of view the memory cell can hold a constant, too. If pi is meant to hold π, it is in my view better to refer to pi as a label than as a variable.

Bear with me here, because I’m still learning, and I might be taking the whole box and label analogy a little too far.

As stated in the Julia documentation:

A variable, in Julia, is a name associated (or bound) to a value. It’s useful when you want to
store a value (that you obtained after some math, for example) for later use.

To me, this acknowledges that both the label and the box metaphors are applicable. You associate the value with a name, much like labeling an item, but you’re also storing the item for later use, just as you place an item in a box for later use.

As stated in the quote from the site in my question, you have to read the code from right to left, so you create or retrieve the item and label it, but isn’t that just creating or retrieving a value and placing it into storage for later use, like a box? Why would reading it from right to left mean it’s a label?

One nice consequence of the label metaphor is that a value can have multiple labels, whereas it feels odd to put the same value in multiple boxes.
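As a small sketch of that point (the variable names are just for illustration): a second label is created with plain assignment, whereas "the same value in a second box" would correspond to an explicit `copy`.

```julia
a = [1, 2, 3]
b = a          # a second label on the same array
c = copy(a)    # a separate array that merely starts with equal contents

push!(b, 4)    # mutate through one label

println(a === b)   # same object under two names
println(a)         # the push! is visible through a
println(c)         # the copy is unaffected
```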


The trouble I have with the box analogy is this:

    mutable struct Me
        a::Int64
        b::Int64
    end

    c = Me(1, 2)
    d = c

The problem I see with the box analogy is that you would think you have 2 boxes, c and d, when in fact you only have 1: both c and d point to the same “box” in this example. However, if you continue and add:

    d = Me(3, 4)

Now c and d reference their own box.
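One way to check this, as a sketch: `===` tests whether two names refer to the very same object. The struct is the one from this post; the field update is added for illustration.

```julia
mutable struct Me
    a::Int64
    b::Int64
end

c = Me(1, 2)
d = c                 # one object, two names
println(d === c)      # same object

d.a = 10              # mutating through d ...
println(c.a)          # ... is visible through c

d = Me(3, 4)          # rebinding d to a fresh object
println(d === c)      # now they are different objects
println((c.a, c.b))   # c is unaffected by the rebinding
```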

Not sure if I just confused the issue or what…


Correct. Basically you have to realize that = may “create” a box, “update” the value in a box, or “change” the box the variable points to.

I guess you don’t really need to consider the “update” case, because from a programmer’s point of view it doesn’t matter whether = updates or creates. However, from a performance point of view, an update is a hell of a lot cheaper than a create.

This I understand because, correct me if I’m wrong, when you say d = c you’re making a copy that points to the same reference (a shallow copy). However, d = Me(3, 4) is independent and not a copy of c. Even if the underlying values were the same, as in d = Me(1, 2), d is still independent of c.

The problem I have is this, consider the quote in my question:

If you imagine that variables are like boxes, you cannot make sense of assignment in Python. For an assignment, you must always read the right-hand side first: that’s where the object is created or retrieved. After that, the variable on the left is bound to the object, like a label stuck to it.

Why would reading it from right to left mean you are labeling the object, rather than storing it for later use, like a box?

I’m most likely taking the box and label analogy too far. I’ll admit I don’t understand the work being done “under the hood”, so to speak. Obviously, Julia is not C# or Java, so I would have to adjust the way I think about variables.

I suppose I should point out, before the compiler developers get involved, that yes, it does a shallow copy “in this case”. However, if the structure is immutable, i.e.:

    struct Me
        a::Int64
        b::Int64
    end

    c = Me(1, 2)
    d = c

It would do a deep(ish) copy. You could have a mutable structure inside Me which is then shallow-copied… it gets confusing…

That I can’t answer.

I tend to think of a variable as a pointer or reference to a location in memory. If the ‘=’ operation is creating or initializing the value, then a new memory location “gets” the value and the variable “points” to that location.

If the ‘=’ is updating the value and its value is immutable, then the memory “pointed” to by the variable is updated. If the value is mutable, then the variable is updated to “point” to the new location.

Compiler optimizations may do other things, but from a developer’s point of view that is the behavior.

Exactly this. Perfect.

I guess I was taking the box and label analogy too far.

I tend to think of the different behaviors with immutable and mutable objects as being about the “identity” of the objects. Mutable objects have an address where they can be found as long as they live, so they have an “identity”. Two mutable objects are different, even if they have the same content, if they live at different memory addresses. Assignment of one mutable struct to different variables just labels the same instance at the same memory address again and again.

Immutable objects don’t have “identities” as they don’t get persistent memory addresses. Using an immutable object is kind of like using the “idea” of that object, because you as a programmer don’t know what the compiler actually does with your immutable structs. Two of them are indistinguishable from each other if they have the same values. So when assigning an immutable value labeled with a variable to a new variable it doesn’t make so much intuitive sense to think of this as giving the same object a second label, because there is no real concept of “same” for these objects in terms of identity, it’s only about the value. Assignment to a new variable might create a new object in memory or it might not, depending on the compiler, but that doesn’t affect how you reason about your program.

It’s a bit different if you have arrays of immutable structs, there you can definitely see different “instances” packed together in memory alongside each other, but once you extract them and do something with the singular values, you get the same idea again.
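A short sketch of the “identity” idea using `===`, which compares mutable objects by identity and immutable ones by content. The two struct names below are hypothetical, chosen only for this illustration:

```julia
# Hypothetical types for illustration only.
mutable struct MutablePoint
    x::Int
end

struct ImmutablePoint
    x::Int
end

m1 = MutablePoint(1)
m2 = MutablePoint(1)
println(m1 === m2)   # two distinct mutable objects, despite equal contents
println(m1 === m1)   # an object is always identical to itself

i1 = ImmutablePoint(1)
i2 = ImmutablePoint(1)
println(i1 === i2)   # immutables with equal contents are indistinguishable
```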


= does exactly the same thing whether the object on the right hand side is mutable or not. It is not helpful to think about them differently.
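A sketch that shows this from the observable side, assuming a hypothetical immutable struct `Holder` with a mutable field: after `d = c` nothing has been copied, and an independent object only appears when you ask for one explicitly.

```julia
# Hypothetical immutable struct that holds a mutable vector.
struct Holder
    items::Vector{Int}
end

c = Holder([1, 2])
d = c                         # plain assignment: just another name

println(d.items === c.items)  # the inner vector is the same object
push!(d.items, 3)
println(c.items)              # the change is visible through c

e = deepcopy(c)               # an explicit, independent copy
push!(e.items, 4)
println(c.items)              # c is unaffected by changes to e
```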


While analogies can be somewhat useful, I would suggest that you just focus on the semantics and not the implementation details.

Julia’s language semantics allow the compiler to make various choices about what to do “under the hood”, depending on various heuristics, to optimize performance. E.g., immutable structs may or may not be copied, or even exist in RAM.


You can think of variables as living inside boxes, the box being the module scope.

For example, if you assign a new variable in a module:

    module MyModule
    var = "var label"
    end

Then this var is boxed into the MyModule scope, so you must access it as MyModule.var unless you export or import it into another box (module scope).
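A runnable sketch of that scoping behavior, meant to be run at the top level of a script or the REPL. It reuses `MyModule` and `var` from above; `@__MODULE__` is used so that the check refers to whichever module the code is run in.

```julia
module MyModule
var = "var label"
end

println(MyModule.var)                   # qualified access works

# The bare name is not visible in the enclosing module yet.
println(isdefined(@__MODULE__, :var))

using .MyModule: var                    # explicitly bring the name in
println(var)                            # now the bare name resolves
```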

So I would say that variables are labels in boxes.

As stated before, try to look past the analogy and build up an independent concept for these things. The analogy only helps you get started if you do not yet have any conception of them.


Ok, I arrived late, but let me do one more analogy:

Forget boxes. Boxes do not exist. Period.

When you create an object, it simply exists. Semantically, it does not matter where it lives. (For performance reasons it may matter, but that is another topic.)

If you just create an object inside an expression and do not give it a label, then Julia knows it does not need to be stored anywhere and may discard it immediately after this use.

If you give it a label (this is called binding), then Julia knows you may want to use that pile of data in the future, and keeps it around somewhere.

You may want to give the object more than one label; the same pile of data may be referred to by many names.

You may also pluck the labels from one pile of data and assign them to another pile of data. None of this changes anything about the piles themselves. What was called current in the iteration before may be called old now.

If a pile of data loses all its labels, then Julia concludes that you do not want anything more to do with it, as you have no way of finding it anymore anyway (that pile is lost from your label-archiving system), and Julia then considers it garbage. When the time for garbage collection comes, Julia sweeps all these unlabeled piles away.

structs are just a pile of labels. As they are a pile, they may be labeled themselves. As they have labels inside, they point to other piles (which may also be labeled in other ways than just the label inside the struct; they may, for example, also have a label in your local scope, outside any struct).
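A sketch of the “pile of labels” picture, using a hypothetical `Basket` struct whose field and a local variable both label the same vector. It describes observable behavior only, not how the compiler lays fields out in memory.

```julia
# Hypothetical struct for illustration; `fruit` is also used as a local name.
struct Basket
    fruit::Vector{String}
end

fruit = ["apple"]
basket = Basket(fruit)           # the field is a second label on the same vector

println(basket.fruit === fruit)  # same pile, two labels
push!(fruit, "pear")
println(basket.fruit)            # the change is visible through the struct

fruit = ["kiwi"]                 # pluck the local label and stick it elsewhere
println(basket.fruit)            # the struct's label still points at the old pile
```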


As a new Julia user, I still see a lot of haze around its memory model. I wish there were a more formal description somewhere that didn’t rely on other languages’ memory models, as the Functions section of the Julia manual does. Coming from a C++ background and not knowing the details of Scheme, most Lisps, Python, Ruby and Perl makes discovering that structs are just a pile of labels not so straightforward.

You can find this in the developer documentation section of the manual, for users who really need an answer.
https://docs.julialang.org/en/v1/devdocs/object/

The layout of the object depends on its type.

I do agree with you that there is more that can be written about memory at a “high level” (without necessarily dropping straight down to the C representation). On the other hand, the overall takeaway is clearly that it’s really not as simple as “this is how structs are laid out”. Also, this memory model changes as new abilities (or complications) arise in the language. For example, Julia 0.7 introduced isbits Union optimizations (see “isbits Union Optimizations” in the Julia documentation), which was a pretty big deal as far as memory layout is concerned. More features like this are being (or will be) added as the whole compilation pipeline continues to improve, so a comprehensive overview of memory for people who don’t want to go through the dev-docs (for every release…) is probably premature.
