Understanding how variables are stored

After my epic boxes or labels question, I decided to put those analogies aside for the moment and focus on the documentation.

On the memory layout page, it states:

The jl_value_t struct is the name for a block of memory owned by the Julia Garbage Collector, representing the data associated with a Julia object in memory.

Questions:

  1. Are Strings, Integers and Floating-Point Numbers objects?
  2. When I create and initialize a variable, for example weapon_damage = 50, is a struct created in memory that stores the value (jl_value_t) and the type (jl_typetag_t)?
  3. What about the variable name associated with the value, where is that stored?

Consider this answer from Software Eng Stack Exchange that uses both a box and a label to describe what’s happening in memory when you declare an object. What I’m looking for is what happens in memory when you bind a variable name to a value.

I think you linked my post two times instead of linking to the memory layout page the second time.

Note the same section says:

A value may be stored “unboxed” in many circumstances (just the data, without the metadata, and possibly not even stored but just kept in registers), […]

It seems to me that the section is about heap-allocated objects that the garbage collector has to keep track of. I am not sure, but I think that even when an object is heap-allocated, the compiler may still prove that it does not escape the local scope and deallocate it at the end of the scope instead of handing it to the garbage collector. I may be wrong, though.

My answers:

  1. Yes, they are often referred to as objects or values, but this does not necessarily mean they are heap-allocated.
  2. It depends. Do you mean when your method is being compiled or executed? When executed, if weapon_damage is just a binding in local scope, it is probably not even stack-allocated and exists only in registers.
  3. It depends. Do you mean when your method is being compiled or executed? In most cases I do not believe the name is stored after the method has been compiled. In your weapon_damage case it probably is not.
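
To make answer 1 concrete, here is a small sketch of my own (not taken from the memory layout page) using reflection functions from Base. It only shows that all three kinds of values have a type and are instances of Any, and that Int and Float64 are "plain bits" types while String is not. Whether a particular value actually ends up boxed on the heap is still up to the compiler.

```julia
# Every value has a type and is an instance of Any, so all three count as objects.
weapon_damage = 50
int_is_object    = weapon_damage isa Any
float_is_object  = 1.5 isa Any
string_is_object = "sword" isa Any

# Int is an alias for the native integer type (Int64 on 64-bit systems).
int_type_matches = typeof(weapon_damage) === Int

# isbitstype is true for immutable plain-data types with no references to
# other objects; such values are candidates for being stored unboxed.
int_is_bits    = isbitstype(Int)
float_is_bits  = isbitstype(Float64)
string_is_bits = isbitstype(String)   # false: a String is not plain bits

println((int_is_bits, float_is_bits, string_is_bits))
```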

Terribly sorry about that, I corrected it.

The part I have trouble understanding is what happens “under the hood” when you create a variable, for example weapon_damage = 20. Is the variable name stored somewhere in memory, with the value 20 associated with it?

It will often not exist at all. Take this example:

function hits_to_kill(hp = 100)
    weapon_damage = 20
    cld(hp, weapon_damage)
end

Yields:

julia> @code_typed hits_to_kill()
CodeInfo(
1 ─     return 5
) => Int64

But what if it can’t find the answer at compile time?

julia> @code_typed hits_to_kill(100)
CodeInfo(
1 ─ %1  = π (20, Core.Compiler.Const(20, false))
│   %2  = Base.checked_sdiv_int(hp, %1)::Int64
│   %3  = Base.slt_int(0, hp)::Bool
│   %4  = (%3 === true)::Bool
│   %5  = Base.mul_int(%2, %1)::Int64
│   %6  = (%5 === hp)::Bool
│   %7  = Base.not_int(%6)::Bool
│   %8  = Base.and_int(%4, %7)::Bool
│   %9  = Core.zext_int(Core.Int64, %8)::Int64
│   %10 = Core.and_int(%9, 1)::Int64
│   %11 = Base.add_int(%2, %10)::Int64
└──       return %11
) => Int64

We still have a 20. What if we compile further?

julia> @code_native debuginfo=:none syntax=:intel hits_to_kill(100)
        .text
        movabs  rcx, 7378697629483820647
        mov     rax, rdi
        imul    rcx
        mov     rax, rdx
        shr     rax, 63
        sar     rdx, 3
        add     rdx, rax
        test    rdi, rdi
        setg    al
        lea     rcx, [4*rdx]
        lea     rcx, [rcx + 4*rcx]
        cmp     rcx, rdi
        setne   cl
        and     cl, al
        movzx   eax, cl
        add     rax, rdx
        ret
        nop

Now the 20 and division have disappeared, replaced with an equivalent series of operations involving multiplication and bitshifts that are faster than division.
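
As a rough sketch of what that assembly computes, the same steps can be written out in Julia. This is my own reading of the listing above, not something the compiler emits as Julia code. The names MAGIC and cld20_via_multiply are made up for illustration; the constant is the one from the movabs instruction and equals ceil(2^67 / 20).

```julia
# Constant taken from the `movabs` instruction; it equals ceil(2^67 / 20).
const MAGIC = Int64(7378697629483820647)

# Hypothetical helper that mirrors the listing step by step.
function cld20_via_multiply(hp::Int64)
    # imul: 128-bit product, of which only the high 64 bits are used
    hi = (widemul(hp, MAGIC) >> 64) % Int64
    # sar 3 plus the sign bit (shr 63): quotient of hp / 20 truncated toward zero
    q = (hi >> 3) + (hi >>> 63)
    # test/setg/lea/cmp/setne/and: round up when hp > 0 and 20q != hp
    round_up = (hp > 0) & (20 * q != hp)
    return q + Int64(round_up)
end

# Compare against the built-in ceiling division on a range of inputs.
matches_cld = all(cld20_via_multiply(Int64(hp)) == cld(hp, 20) for hp in -1000:1000)
println(matches_cld)
```

Neither a division instruction nor the literal 20 appears in this version either; the 20 survives only as the multiply-by-4, multiply-by-5 pair used for the remainder check.
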
Here is another, simpler example where we see a difference:

julia> @code_typed hits_to_kill(100.0)
CodeInfo(
1 ─ %1 = Base.div_float(hp, 20.0)::Float64
│   %2 = Base.ceil_llvm(%1)::Float64
│   %3 = Base.mul_float(20.0, %2)::Float64
│   %4 = Base.sub_float(hp, %3)::Float64
│   %5 = Base.sub_float(hp, %4)::Float64
│   %6 = Base.div_float(%5, 20.0)::Float64
│   %7 = Base.rint_llvm(%6)::Float64
└──      return %7
) => Float64

Now we have 20.0 instead of 20.

To understand what answers code will produce, you need to know the semantics of the language.
But internally, “under the hood”, the compiler is free to do very different things from what you wrote, so long as it produces an identical answer.
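
One way to see the "identical answer" part is to compare a version that uses a named local with one that writes the literal directly. This is only a sketch; hits_named and hits_literal are names I made up for it. The semantics guarantee the same results, and you can run @code_typed on both in the REPL to compare what the compiler produces on your Julia version.

```julia
# Version with a named local binding.
function hits_named(hp)
    weapon_damage = 20
    return cld(hp, weapon_damage)
end

# Version with the literal written in place; no variable name at all.
hits_literal(hp) = cld(hp, 20)

# The language semantics only promise that both give the same answers.
same_answers = all(hits_named(hp) == hits_literal(hp) for hp in -200:200)
println(same_answers)
```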

is the variable name stored in a space somewhere in memory, with the value 20 associated with it?

Probably not.

For example, if you write weapon_damage = 20, never change the value, and just use it in some computation, the compiled method will probably put the literal value 20 in the places where you used weapon_damage.

If weapon_damage is a binding that changes value many times inside the scope, it may be represented by a stack-allocated memory block. You still do not need to worry about the GC, and the name you used is probably not stored but replaced by an address on the stack.

In other cases, such as when the binding may hold heap-allocated objects of different types, that struct may be used. Note, however, that even that struct does not seem to store any “name” of a variable (at least not the one you gave it in your code).
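
A small sketch of that last point, using names of my own choosing: two different variable names can refer to the very same heap-allocated object, which would not make sense if the object itself carried "its" name.

```julia
# One heap-allocated array, reachable through two different names.
a = [10, 20, 30]
b = a                    # binds a second name; does not copy the array

same_object = a === b    # identity check: both names refer to one object

# Mutating through one name is visible through the other.
push!(b, 40)
length_seen_through_a = length(a)

println((same_object, length_seen_through_a))
```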

As mentioned multiple times in answers to your previous question, this is deliberately left unspecified in a lot of circumstances, which allows various optimizations.

That 20::Int might be in a register, or constant-propagated, or allocated in a box (e.g. for a global). The answers may change for the same method compiled with different argument types, and from one Julia version to another.
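
The global case is the one where the name observably survives at run time, because a module keeps its bindings keyed by Symbol. The following is a minimal sketch, assuming it is run at top level as a script or in the REPL; local_only and local_damage are names invented for the illustration.

```julia
# A global binding: the module records it under the Symbol :weapon_damage.
# (The `global` keyword is redundant at top level; it is here for emphasis.)
global weapon_damage = 20

mod = @__MODULE__
global_is_defined = isdefined(mod, :weapon_damage)
global_value      = getfield(mod, :weapon_damage)   # look the value up by name
global_is_listed  = :weapon_damage in names(mod; all = true)

# A local binding: its name is not registered in the module.
function local_only()
    local_damage = 20
    return local_damage + 1
end
local_result = local_only()
local_is_defined = isdefined(mod, :local_damage)

println((global_is_defined, global_value, local_is_defined))
```

How the value behind a global binding is stored is an implementation detail and may differ between Julia versions; the sketch only shows that the name can be looked up.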

Again, this information is mostly relevant if you want to work on Julia’s internals. Otherwise, if you are just learning Julia, this is a distraction.