Is it appropriate to say a variable is a label or box?

You can think of variables as living inside boxes, the box being the module scope.

For example, if you assign a new variable in a module:

module MyModule
var = "var label"
end

Then this var is boxed into the MyModule scope, so you must access it as MyModule.var unless you export or import it into another box (module scope).
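A minimal sketch of that point, reusing the MyModule definition from above. The `using .MyModule: var` line and the names `qualified` and `unqualified` are illustrative additions, not part of the original post:

```julia
module MyModule
var = "var label"
end

# Outside the module, the name has to be qualified.
qualified = MyModule.var

# Importing the name makes it usable without the prefix.
using .MyModule: var
unqualified = var

# Both names refer to the very same object.
println(qualified === unqualified)
```

Either way there is only one string object; the import just adds a second way to reach it.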

So I would say that variables are labels in boxes.

As stated before, try to look past the analogy and build up an independent concept for these things. The analogy only helps you get started if you do not yet have any conception of them.

Ok, I arrived late, but let me do one more analogy:

Forget boxes. Boxes do not exist. Period.

When you create an object, it simply exists. Semantically, it does not matter where it lives. (For performance reasons it may matter, but that is another topic.)

If you just create an object inside an expression and do not give it a label, then Julia knows it does not need to be stored anywhere and may discard it immediately after this use.

If you give it a label (this is called binding), then Julia knows you may want to use that pile of data in the future, and keeps it around somewhere.

You may want to give the object more than one label; the same pile of data may be referred to by many names.

You may also pluck the labels from one pile of data and attach them to another pile of data. None of this changes anything about the piles themselves. What was called current in the previous iteration may be called old now.

If a pile of data loses all its labels, then Julia assumes you do not want anything more to do with it, as you have no way of finding it anymore anyway (that pile is lost from your label-archiving system), and Julia then considers it garbage. When the time for garbage collection comes, Julia sweeps all these unlabeled piles away.
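A small sketch of the label juggling described here. The variable names `current`, `alias`, and `old` are only illustrative:

```julia
current = [1, 2, 3]      # bind the label `current` to a new array
alias = current          # a second label for the same array

push!(alias, 4)          # mutate the object through one label...
println(current)         # ...and the change is visible through the other

old = current            # what was called `current` is now also called `old`
current = [10, 20]       # `current` now names a different object

println(old === alias)   # both still name the original array
println(old)             # the original array itself was never touched
```

Rebinding `current` did nothing to the first array; only the labels moved.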

structs are just a pile of labels. As they are a pile, they may be labeled themselves. As they have labels inside, they point to other piles (which may be labeled in other ways than just the label inside the struct; they may, for example, also have a label in your local scope, outside any struct).
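As a rough illustration, assuming a made-up `Pile` type (not from any library), the same array can carry a label inside a struct and a label in the local scope at the same time:

```julia
# Hypothetical struct used only for this illustration.
struct Pile
    data::Vector{Int}
    name::String
end

numbers = [1, 2, 3]              # a local label for the array
pile = Pile(numbers, "demo")     # the field `data` labels the same array

println(pile.data === numbers)   # one object, two labels

push!(numbers, 4)                # mutate via the local label
println(pile.data)               # the change shows up through the struct's label
```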

As a new Julia user, I still see a lot of haze around its memory model. I wish there was a more formal description somewhere, one which didn’t rely on other languages’ memory models as Functions · The Julia Language does. Coming from a C++ background and not knowing the details of Scheme, most Lisps, Python, Ruby and Perl makes discovering that “structs are just a pile of labels” not so straightforward.

You can find this in the developer documentation section of the manual, for users who really need an answer.
https://docs.julialang.org/en/v1/devdocs/object/

The layout of the object depends on its type.

I do agree with you that there is more that can be written about memory at a “high level” (without necessarily dropping straight down to the C representation). On the other hand, the overall takeaway is clearly that it’s really not as simple as “this is how structs are laid out”. Also, this memory model changes as new abilities (or complications) arise in the language. For example, Julia 0.7 introduced isbits Union Optimizations · The Julia Language, which was a pretty big deal as far as memory layout is concerned. More features like this are being (or will be) added as the whole compilation pipeline continues to improve, so a comprehensive overview of memory for people who don’t want to go through the dev-docs (for every release…) is probably premature.

For what reason do you want to know more? Just for the sake of knowing, or is there any actual code you are writing that depends on knowing this?

As a user, arguably you almost never really have a need to know, unless you are interfacing with C.

Abstractions are a programmer’s best friend. Also, underspecifying a lot of low-level details in the exposed API leaves room for optimizations and refactoring. As pointed out multiple times in this topic, a value of some type may not even exist in memory as such, depending on compiler optimizations.

I was making a lot of basic mistakes in the first few weeks, trying to stretch the C/C++ memory model to understand what is going on. By memory model, I don’t mean implementation details. I’m looking for something less fuzzy than label and box, whatever that means. Perhaps some sort of abstract machine tying variables to bytes without referring to other languages.

An abstraction is what I’m looking for, not a byte layout, although I needed that too for the C interface. I would like something a bit more formal than “structs are just a pile of labels”.

Okay, something like “a = expression makes a evaluate to the object expression is evaluated to within the scope a is valid” maybe? Do you have a particular piece of code that is confusing?

Here is an example I just ran into, which I can’t parse using the “a = expression makes a evaluate to the object expression is evaluated to within the scope a is valid” rule. I’m expecting behavior similar to:

a = b = 42

where a and b evaluate to the same value, but in this code snippet:

using AbstractPlotting, GLMakie, AbstractPlotting.MakieLayout, Random

scene, layout = layoutscene(padding = 4, resolution = (1200, 1000));
ax1 = layout[1, 1] = LAxis(scene, title = "Lines")
lines!(ax1, randn(10)) # this works
lines!(layout[1, 1], randn(10)) # this doesn't work
display(scene)

I see different behavior from what I would expect if ax1 and layout[1, 1] evaluated to the same object.

layout[1, 1] = ... is not a normal assignment; it is syntax for setindex!(layout, ..., 1, 1), which is an overloadable function and can do absolutely whatever. The same goes for layout[1, 1], which is getindex(layout, 1, 1). So there are no deductions you can make syntax-wise from that (other than that ax1 will refer to the object on the right-hand side of the assignment, since a chained assignment evaluates to its right-hand side).
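To make this concrete, here is a small sketch with two made-up types, `Shelf` and `Slot` (hypothetical, and unrelated to the layout package in question). Writing with index syntax stores the value, but reading with index syntax returns a description of the position instead:

```julia
# Hypothetical types for illustration only.
struct Slot
    index::Int
end

struct Shelf
    items::Dict{Int,Any}
end
Shelf() = Shelf(Dict{Int,Any}())

# Overload what `shelf[i] = value` and `shelf[i]` mean for a Shelf.
Base.setindex!(s::Shelf, value, i::Int) = (s.items[i] = value; s)
Base.getindex(s::Shelf, i::Int) = Slot(i)

shelf = Shelf()
x = shelf[1] = "book"    # calls setindex!(shelf, "book", 1); x is bound to "book"

println(x)               # the right-hand side of the assignment
println(shelf[1])        # whatever getindex decides to return
println(x == shelf[1])   # the two need not be related at all
```

Nothing in the syntax forces `getindex` to hand back what `setindex!` stored, which is why `x` and `shelf[1]` can differ.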

But this:

b = zeros(3)
a = b[2] = 42

uses setindex! and behaves as expected. Are you saying that the library writer overloaded setindex!, so it behaves in an unusual way? What is the way to find out whether the overload will behave like an array assignment or not?

I’m saying they could and since we are talking about what the syntax guarantees, that’s what is interesting.
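Even the plain array case from the question shows the gap between the two, if you look closely. A short sketch using the same two lines, with some added checks:

```julia
b = zeros(3)            # a Vector{Float64}
a = b[2] = 42           # setindex! converts 42 to 42.0 before storing it

println(typeof(a))      # type of the right-hand side
println(typeof(b[2]))   # element type of the array
println(a == b[2])      # numerically equal
println(a === b[2])     # but not identical
```

Here `a` is bound to the integer on the right-hand side, while `b[2]` returns the converted floating-point value stored in the array.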

Some sleuthing reveals that, yes, layout has a custom setindex! https://github.com/jkrumbiegel/GridLayoutBase.jl/blob/6734d2ad247361261b343975fe36a7c6604de8fb/src/gridlayout.jl#L1033

It took a bit of searching with JuliaHub to find this out, but in general, the simple way to do it at the REPL is to use the @which, @less, or @edit macros. E.g.

julia> @which a[1] = 2
setindex!(A::Array{T,N} where N, x, i1::Int64) where T in Base at array.jl:766

Just to be clear, my analogy is intentionally hazy.

My focus was to create a mental model, using simple images, that actually describes the semantics of Julia’s memory model. It intentionally leaves out some details that may be important for critical performance, but not for reasoning about the correctness of the code.

I think it gives some important intuitions:

  1. Names/labels are just a way to reference something that already exists independently of the label.
  2. As labels are external to the object itself, giving new labels or losing old ones never modifies an object or creates a new one. If you lose all labels, the object may be removed from memory, but you will not notice, because you cannot reference it anymore.

Also, I do not have to go back and change the analogy to include mutable and immutable structs. Everything I said still stands, but in mutable piles you can replace a label in the pile by another label, while immutable structs are piles in which the labels are superglued when created, so you cannot replace a label without throwing away the whole pile and creating a new one.
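A hedged sketch of the difference, with two made-up types, `MutablePile` and `FixedPile` (names chosen here just for illustration):

```julia
# Hypothetical types for illustration only.
mutable struct MutablePile
    item::Vector{Int}
end

struct FixedPile
    item::Vector{Int}
end

first_item = [1]
second_item = [2]

m = MutablePile(first_item)
m.item = second_item              # swap the label inside the mutable pile
println(m.item === second_item)

f = FixedPile(first_item)
replaced = try
    f.item = second_item          # an immutable struct rejects this
    true
catch
    false
end
println(replaced)

f2 = FixedPile(second_item)       # build a new pile instead
println(f2.item === second_item)
```

Note that the superglue only fixes which object each label points to. The array behind `f.item` is itself still a mutable pile and can be changed with `push!`.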

I can tell you why I implemented it like this if that helps. The problem is that a grid layout seems to be a matrix-like container at first glance. But you can place objects across multiple rows and columns and you can place multiple objects at overlapping locations. That means there is no clear mapping from layout[i, j] to an object at that position in the layout.

The getindex syntax returns a GridPosition object describing the position you queried instead, because such an object can be a useful thing to feed to other functions, rather than directly returning an array of objects that match the queried position in the layout.

That’s why ax1 != layout[1, 1]. The syntax is just too useful not to use it, even though the underlying logic is different from that of normal matrices.

You can retrieve a vector of objects at a layout position with contents(layout[rows, columns]).

Sorry for the delay, but here is another puzzle I don’t understand, given the vague definitions of fundamental concepts: I thought I understood "pass-by-sharing" in Julia until I found this

You are not passing anything to the function, you have captured (closed over) a local variable. Look up the docs for closures for that.

“A closure is simply a callable object with field names corresponding to captured variables” explains the existence of fields, but leaves the validity and semantics of a captured variable outside its scope puzzling. Is there more documentation on closures than in Julia Functions · The Julia Language?
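For what it is worth, a minimal sketch of a captured variable outliving the call that created it. The names `make_counter`, `count`, and `increment` are hypothetical, and the `fieldnames` line inspects an implementation detail that may differ between Julia versions:

```julia
function make_counter()
    count = 0                    # local variable captured by the closure
    increment() = (count += 1)   # closes over `count`
    return increment
end

counter = make_counter()
counter()
println(counter())               # the captured variable persists between calls

other = make_counter()           # a fresh call captures a fresh `count`
println(other())

# Implementation detail: the closure is an object whose fields hold the captures.
println(fieldnames(typeof(counter)))
```

In the label picture, the closure object keeps a label on the captured variable, so it stays reachable after `make_counter` has returned.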

You can still think of a variable as a box, even in Python. Why?
I’ll use the words “memory address” and “id” interchangeably.
Everything in Python is an object (except keywords). An object is itself a box (id, value, type) without a name attached to it. If you want a name for that box (object), you have to define a variable. That variable will contain the memory address (id) of the object; in other words, it is a reference to a specific object. This may differ from other programming languages, where a variable is a box labeled with a name that contains a value (not a memory address).

In Python, when you write:
[1, 3, 4]
{"key": "value"}
set()
4
"string"

id("string")
id({"key": "value"})
"""
All the expressions above have an id
"""

Box = [1, 3, 4]  # Box is a box that contains the memory address of the [1, 3, 4] object (not a sequence of values).

In Python, generally, all values are objects. Since there are no names attached to the objects (things), Python uses the values to represent the objects.
But whether two objects are identical is decided by id, not by value, because there are mutable and immutable objects.

Always keep in mind that id takes priority over everything in Python. When you access anything, Python first uses the id of that particular thing.