Is it appropriate to say a variable is a label or box?

Exactly this. Perfect.

I guess I was taking the box and label analogy too far.

I tend to think of the different behaviors with immutable and mutable objects as being about the “identity” of the objects. Mutable objects have an address where they can be found as long as they live, so they have an “identity”. Two mutable objects are different, even if they have the same content, if they live at different memory addresses. Assignment of one mutable struct to different variables just labels the same instance at the same memory address again and again.

Immutable objects don’t have “identities” as they don’t get persistent memory addresses. Using an immutable object is kind of like using the “idea” of that object, because you as a programmer don’t know what the compiler actually does with your immutable structs. Two of them are indistinguishable from each other if they have the same values. So when assigning an immutable value labeled with a variable to a new variable it doesn’t make so much intuitive sense to think of this as giving the same object a second label, because there is no real concept of “same” for these objects in terms of identity, it’s only about the value. Assignment to a new variable might create a new object in memory or it might not, depending on the compiler, but that doesn’t affect how you reason about your program.
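To make the identity point concrete, here is a minimal sketch using `===` (egal), which compares mutable objects by identity and immutable ones by content. The type names `MutablePoint` and `FrozenPoint` are made up for illustration.

```julia
# Hypothetical types, only for illustration.
mutable struct MutablePoint
    x::Int
end

struct FrozenPoint
    x::Int
end

p = MutablePoint(1)
q = MutablePoint(1)
different_instances = p === q     # false: same content, different identity

r = p                             # second label for the same instance
r.x = 5                           # mutation is visible through both labels
same_instance = r === p           # true

# Immutable values with equal content cannot be told apart.
indistinguishable = FrozenPoint(1) === FrozenPoint(1)   # true
```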

It’s a bit different if you have arrays of immutable structs: there you can definitely see different “instances” packed together in memory alongside each other. But once you extract them and do something with the individual values, you get the same idea again.
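A small sketch of the array case, assuming a made-up isbits type `Cell`. The elements are stored inline in the array, yet a value taken out of it is still indistinguishable from any other `Cell` with the same fields.

```julia
# Hypothetical immutable type with plain-data fields.
struct Cell
    row::Int
    col::Int
end

cells = [Cell(1, 1), Cell(1, 2), Cell(2, 1)]

stored_inline = isbitstype(Cell)                          # true: elements are stored inline
packed = sizeof(cells) == length(cells) * sizeof(Cell)    # the data is just the fields, back to back

first_cell = cells[1]                                     # extract a value
same_value = first_cell === Cell(1, 1)                    # true: only the content matters
```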

5 Likes

= does exactly the same thing whether the object on the right hand side is mutable or not. It is not helpful to think about them differently.
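A quick sketch of that claim: `=` only rebinds a name, for an `Int` and for a `Vector` alike. What differs is that a mutable object can additionally be changed in place through any of its names.

```julia
# Immutable value: rebinding m does not touch n.
n = 1
m = n
m = 2

# Mutable value: rebinding w does not touch v either.
v = [1, 2, 3]
w = v
w = [9, 9]

# Mutation (not assignment) is what shows up through every name.
u = v
push!(u, 4)
```

After this runs, `n` is still `1` and `v` is `[1, 2, 3, 4]`: the `push!` changed the object, the assignments did not.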

9 Likes

While analogies can be somewhat useful, I would suggest that you just focus on the semantics and not the implementation details.

Julia’s language semantics allow the compiler to make various choices about what to do “under the hood”, depending on various heuristics, to optimize performance. E.g., immutable structs may or may not be copied, or may not even exist in RAM.

4 Likes

You can think of variables as living inside boxes, the box being the module scope.

For example, if you assign a new variable in a module:

module MyModule
var = "var label"
end

Then this var is boxed into the MyModule scope, so you must access it as MyModule.var unless you export or import it into another box (module scope).

So I would say that variables are labels in boxes.
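A minimal sketch of the module-as-box idea, run at top level. The name `exported_var` is added here only to illustrate `export`; it is not part of the example above.

```julia
module MyModule
export exported_var          # hypothetical extra name, made visible via `using`

var = "var label"
exported_var = "exported label"
end

qualified = MyModule.var     # always reachable with the module prefix

using .MyModule              # brings exported names into the current scope
unqualified = exported_var   # works without the prefix; plain `var` would not
```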

As stated before, try to look past the analogy and build up an independent concept for these things. The analogy only helps you get started if you do not yet have any conception of them.

1 Like

Ok, I arrived late, but let me do one more analogy:

Forget boxes. Boxes do not exist. Period.

When you create an object it simply exists. Semantically, it does not matter where it lives. (For performance reasons it may matter, but that is another topic.)

If you just create an object inside an expression and do not give it a label, then Julia knows it does not need to be stored anywhere and may discard it immediately after this use.

If you give it a label (this is called binding), then Julia knows you may want to use that pile of data in the future, and keeps it around somewhere.

You may want to give the object more than one label; the same pile of data may be referred to by many names.

You may also pluck the labels from one pile of data and assign them to another pile of data; none of this changes anything about the piles themselves. What was called current in the previous iteration may be called old now.

If a pile of data loses all its labels, then Julia concludes you no longer want anything to do with it, since you have no way of finding it anyway (that pile is lost from your label-archiving system), and Julia then considers it garbage. When the time for garbage collection comes, Julia sweeps all these unlabeled piles away.

structs are just a pile of labels. As they are a pile, they may be labeled themselves. As they have labels inside, they point to other piles (which may be labeled in other ways than just the label inside the struct; they may, for example, also have a label in your local scope, outside any struct).
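Here is the labels-and-piles picture as a short sketch. `Pile` is a made-up struct, and the comments about garbage collection describe what Julia is allowed to do, not something the code observes.

```julia
# Hypothetical struct: a pile holding two labels.
struct Pile
    numbers::Vector{Int}
    name::String
end

current = [1, 2, 3]
old = current              # a second label on the same pile of data
current = [4, 5, 6]        # the label `current` is plucked off and put on a new pile

p = Pile(old, "first")     # the struct's field is one more label for the first pile
shares_data = p.numbers === old

old = nothing              # drop the local label; the pile is still reachable via p.numbers
still_there = p.numbers == [1, 2, 3]

p = nothing                # now nothing refers to [1, 2, 3]; it is eligible for garbage collection
```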

11 Likes

As a new Julia user, I still see a lot of haze around its memory model. I wish there was a more formal description somewhere that didn’t rely on other languages’ memory models, as https://docs.julialang.org/en/v1/manual/functions/#Argument-Passing-Behavior does. Coming from a C++ background and not knowing the details of Scheme, most Lisps, Python, Ruby and Perl makes discovering that structs are just a pile of labels not so straightforward.

You can find this in the developer documentation section of the manual, for users who really need an answer.
https://docs.julialang.org/en/v1/devdocs/object/

The layout of the object depends on its type.

I do agree with you that there is more that can be written about memory at a “high level” (without necessarily dropping straight down to the C representation). On the other hand, the overall takeaway is clearly that it’s really not as simple as “this is how structs are laid out”. Also, this memory model changes as new abilities (or complications) arise in the language. For example, julia 0.7 introduced https://docs.julialang.org/en/v1/devdocs/isbitsunionarrays/, which was a pretty big deal as far as memory layout is concerned. More features like this are (/will be) added as the whole compilation pipeline continues to improve, so a comprehensive overview of memory for people who don’t want to go through the dev-docs (for every release…) is probably premature.

3 Likes

For what reason do you want to know more? Just for the sake of knowing, or is there any actual code you write that depends on knowing this?

3 Likes

As a user, arguably you almost never really have a need to know, unless you are interfacing with C.

Abstractions are a programmer’s best friend. Also, underspecifying a lot of low-level details in the exposed API leaves room for optimizations and refactoring. As pointed out multiple times in this topic, a value of some type may not even exist in memory as such, depending on compiler optimizations.

5 Likes

I was making a lot of basic mistakes in the first few weeks, trying to stretch the C/C++ memory model to understand what is going on. By memory model, I don’t mean implementation details. I’m looking for something less fuzzy than label and box, whatever that means. Perhaps some sort of abstract machine tying variables to bytes without referring to other languages.

And abstraction is what I’m looking for, not a byte layout, although I needed that too for C interface. I would like something a bit more formal than structs are just a pile of labels.

Okay, something like "a = expression makes a evaluate to the object expression is evaluated to within the scope a is valid" maybe? Do you have a particular piece of code that is confusing?

2 Likes

Here is an example I just ran into, which I can’t parse using the "a = expression makes a evaluate to the object expression is evaluated to within the scope a is valid" rule. I’m expecting behavior similar to:

a = b = 42

where a and b evaluate to the same value, but in this code snippet:

using AbstractPlotting, GLMakie, AbstractPlotting.MakieLayout, Random

scene, layout = layoutscene(padding = 4, resolution = (1200, 1000));
ax1 = layout[1, 1] = LAxis(scene, title = "Lines")
lines!(ax1, randn(10)) # this works
lines!(layout[1, 1], randn(10)) # this doesn't work
display(scene)

I see different behavior from what I would expect if ax1 and layout[1, 1] evaluated to the same object.

layout[1, 1] = ... is not a normal assignment; it is syntax for setindex!(layout, value, 1, 1), which is an overloadable function and can do absolutely whatever. The same goes for layout[1, 1], which is getindex(layout, 1, 1). So there are no deductions you can make syntax-wise from that (other than that ax1 will refer to the value on the right-hand side, since an indexed assignment expression evaluates to the assigned value, not to whatever setindex! returns).
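To see how little the syntax guarantees, here is a sketch with a made-up type `EchoBox` whose `setindex!` throws the value away and whose `getindex` returns something unrelated. The chained assignment still binds the right-hand side.

```julia
# Hypothetical type, only to show that indexing syntax is just function calls.
struct EchoBox end

Base.setindex!(box::EchoBox, value, i, j) = nothing      # stores nothing at all
Base.getindex(box::EchoBox, i, j) = (:position, i, j)    # returns a description instead

box = EchoBox()
x = box[1, 1] = "axis"      # lowered to setindex!(box, "axis", 1, 1); x gets the right-hand side
y = box[1, 1]               # lowered to getindex(box, 1, 1)

# With a plain array the two agree in value, but x is still the original right-hand side:
b = zeros(3)
a = b[2] = 42               # a is the Int 42, b[2] is the Float64 42.0
```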

2 Likes

But this:

b = zeros(3)
a = b[2] = 42

uses setindex! and behaves as expected. Are you saying that the library writer overloaded setindex! so that it behaves in an unusual way? How can I find out whether the overload will behave like an array assignment or not?

I’m saying they could and since we are talking about what the syntax guarantees, that’s what is interesting.

1 Like

Some sleuthing reveals that, yes, layout has a custom setindex! https://github.com/jkrumbiegel/GridLayoutBase.jl/blob/6734d2ad247361261b343975fe36a7c6604de8fb/src/gridlayout.jl#L1033

It took a bit of searching with https://juliahub.com/ui/CodeSearch to find this out, but in general, the simple way to do it at the REPL is with the @which, @less or @edit macros. E.g.

julia> @which a[1] = 2
setindex!(A::Array{T,N} where N, x, i1::Int64) where T in Base at array.jl:766
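The same check can be done from a script as well. This is a small sketch; `using InteractiveUtils` is needed outside the REPL, and the exact method signature and line number printed will vary with the Julia version.

```julia
using InteractiveUtils   # provides @which outside the REPL

b = zeros(3)

set_method = @which b[2] = 42   # the method that the indexed assignment dispatches to
get_method = @which b[2]        # the method that plain indexing dispatches to

println(set_method)             # prints the signature and source location
println(get_method)
```

If the reported method lives in a package rather than in Base, that is the hint that indexing may not behave like it does for an array.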
4 Likes

Just to be clear, my analogy is intentionally hazy.

My focus was: create a mental model using simple images that actually describes the semantics of Julia’s memory model. It intentionally leaves out some details that may be important for critical performance, but not for reasoning about the correctness of the code.

I think it gives some important intuitions:

  1. Names/labels are just a way to reference something that already exists independently of the label.
  2. As labels are external to the object itself, giving new labels or losing old ones never modifies an object or creates a new one. If you lose all labels the object may be removed from memory, but then you do not know this, because you cannot reference it anymore.

Also, I do not have to go back and change the analogy to include mutable and immutable structs. Everything I said still stands, but in mutable piles you can replace a label in the pile by another label, while immutable structs are piles in which the labels are superglued when created, so you cannot replace a label without throwing away the whole pile and creating a new one.
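A sketch of the superglue part, with made-up types `LoosePile` and `GluedPile` and a hypothetical helper `try_replace_label!`. Note that the glue only fixes which pile a label points to; the pile it points to can still change if it is itself mutable.

```julia
# Hypothetical types for illustration.
mutable struct LoosePile
    label::Vector{Int}
end

struct GluedPile
    label::Vector{Int}
end

# Hypothetical helper: returns true if the field could be replaced, false if Julia refused.
function try_replace_label!(pile, new_label)
    try
        pile.label = new_label
        return true
    catch
        return false
    end
end

loose = LoosePile([1])
glued = GluedPile([1])

replaced_loose = try_replace_label!(loose, [2])   # true: the label is swapped in place
replaced_glued = try_replace_label!(glued, [2])   # false: fields of an immutable struct cannot be reassigned

push!(glued.label, 7)            # allowed: this changes the vector, not the struct's label
glued = GluedPile([2])           # a "replacement" means building a whole new pile
```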

7 Likes

I can tell you why I implemented it like this if that helps. The problem is that a grid layout seems to be a matrix-like container at first glance. But you can place objects across multiple rows and columns and you can place multiple objects at overlapping locations. That means there is no clear mapping from layout[i, j] to an object at that position in the layout.

The getindex syntax returns a GridPosition object describing the position you queried instead, because such an object can be a useful thing to feed to other functions, rather than directly returning an array of objects that match the queried position in the layout.

That’s why ax1 != layout[1, 1]. The syntax is just too useful not to use, even though the underlying logic is different from that of normal matrices.

You can retrieve a vector of objects at a layout position with contents(layout[rows, columns]).
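For readers who want to see the shape of that design, here is a toy model. It is not the GridLayoutBase implementation: `ToyGrid`, `ToySpan`, `ToyGridPosition`, `as_range` and `toy_contents` are all hypothetical names, and the lookup only matches exactly equal spans, which is a simplification.

```julia
# Toy model only; all names here are hypothetical.
struct ToySpan
    rows::UnitRange{Int}
    cols::UnitRange{Int}
end

struct ToyGrid
    placed::Vector{Pair{ToySpan,Any}}     # several objects may share or overlap a span
end
ToyGrid() = ToyGrid(Pair{ToySpan,Any}[])

struct ToyGridPosition                    # what indexing returns: a position, not an object
    grid::ToyGrid
    span::ToySpan
end

as_range(i::Int) = i:i
as_range(r::UnitRange{Int}) = r

# grid[rows, cols] describes a position in the grid.
Base.getindex(g::ToyGrid, rows, cols) =
    ToyGridPosition(g, ToySpan(as_range(rows), as_range(cols)))

# grid[rows, cols] = obj adds obj at that span without replacing anything.
function Base.setindex!(g::ToyGrid, obj, rows, cols)
    push!(g.placed, Pair{ToySpan,Any}(ToySpan(as_range(rows), as_range(cols)), obj))
    return g
end

# Collect every object placed at exactly this span.
toy_contents(pos::ToyGridPosition) =
    [obj for (span, obj) in pos.grid.placed
     if span.rows == pos.span.rows && span.cols == pos.span.cols]

grid = ToyGrid()
ax = grid[1, 1] = "axis A"      # ax is the right-hand side, "axis A"
grid[1, 1] = "axis B"           # a second object at the same position
grid[1:2, 1] = "tall legend"    # an object spanning two rows

pos = grid[1, 1]                # a ToyGridPosition, not "axis A"
at_pos = toy_contents(pos)      # both objects placed at (1, 1)
```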

4 Likes