Mutable vs immutable struct: modifying an array field

I am confused about the mutable vs immutable struct documentation.

For composite types, this means that the identity of the values of its fields will never change. When the fields are bits types, that means their bits will never change, for fields whose values are mutable types like arrays, that means the fields will always refer to the same mutable value even though that mutable value's content may itself be modified.

I think I have a basic question, but it is not quite answered here: question1 and question2. It concerns when I should use mutable vs immutable structs.

In my application, I am trying to create a struct containing all fields of interest (that have already been allocated), and then create sub-structs (referring to the already allocated fields) that do particular things to the already allocated elements.

For example, I will make a struct with three fields `a`, `b`, and `c` for already allocated `a,b,c` and then do specific operations to `a` or (`b` and `c`) by defining convenience structs that refer to the already allocated `a,b,c` from my "everything" struct.

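``````julia
mutable struct Everything{T,VT<:AbstractVector{T}}
    a::Ref{T}
    b::VT
    c::VT
end

# Convenience wrapper that refers to the same `a` as an `Everything`.
mutable struct Parta{T<:AbstractFloat}
    a::Ref{T}
end

function Parta(Ev::Everything)
    return Parta(Ev.a)
end

# Calling a `Parta` increments the shared `a` in place.
function (Pa::Parta)()
    Pa.a[] += 1.0
end

# Convenience wrapper that refers to the same `b` and `c` as an `Everything`.
mutable struct Partbc{T,VT<:AbstractVector{T}}
    b::VT
    c::VT
end

function Partbc(Ev::Everything)
    return Partbc(Ev.b, Ev.c)
end

# Calling a `Partbc` updates the shared arrays in place.
function (Pbc::Partbc)()
    Pbc.b .+= 2.0
    Pbc.c .+= 3.0
end
``````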

Now initializing the example:

``````a0 = Ref{Float64}(0.0)
b0 = randn(5)
c0 = randn(5)
a = deepcopy(a0)
b = copy(b0)
c = copy(c0)
Ev = Everything(a, b, c)
Pa = Parta(Ev)
Pbc = Partbc(Ev)
Pa()
Pbc()
``````

My understanding is that `Pa` and `Pbc` just point to `a`, `b`, and `c`, which is why we can expect `Pbc.b-b` to be zero but `Pbc.b-b0` to be 2, e.g.,

``````julia
julia> hcat(Pbc.b-b, Pbc.b-b0)
5×2 Matrix{Float64}:
 0.0  2.0
 0.0  2.0
 0.0  2.0
 0.0  2.0
 0.0  2.0
``````

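When I make everything immutable structs, it works the same, as expected (the difference from `b0` is now 4 because `Pbc()` and `immPbc()` have both incremented the same `b`):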

``````julia
struct immEverything{T,VT<:AbstractVector{T}}
    a::Ref{T}
    b::VT
    c::VT
end

struct immParta{T<:AbstractFloat}
    a::Ref{T}
end

function immParta(immEv::immEverything)
    return immParta(immEv.a)
end

function (Pa::immParta)()
    Pa.a[] += 1.0
end

struct immPartbc{T,VT<:AbstractVector{T}}
    b::VT
    c::VT
end

function immPartbc(immEv::immEverything)
    return immPartbc(immEv.b, immEv.c)
end

function (Pbc::immPartbc)()
    Pbc.b .+= 2.0
    Pbc.c .+= 3.0
end

immEv = immEverything(a, b, c)
immPa = immParta(immEv)
immPbc = immPartbc(immEv)
immPa()
immPbc()

julia> hcat(immPbc.b-b, immPbc.b-b0)
5×2 Matrix{Float64}:
 0.0  4.0
 0.0  4.0
 0.0  4.0
 0.0  4.0
 0.0  4.0
``````

For this example, I am curious when I should use mutable vs immutable structs. The docs say

Mutable values, on the other hand are heap-allocated and passed to functions as pointers to heap-allocated values

but I am not sure how immutable values are passed to functions (does this happen when calling `immPbc`?). In my case, is it a pointer to the (stack?)-allocated values `a,b,c`? It seems that I should use immutable structs here, so I'm curious about a simple modification to this example that shows when I would need to use `mutable` structs.

In general, Julia uses "pass by sharing", so if you pass an object (either mutable or immutable), the function semantically has access to the exact same object. How this happens exactly depends on a lot of factors, in part whether the called function was inlined or not. Immutable objects just have the advantage that they can (sometimes) be shared by copying only the parts that you actually need (such as a field of an `isbitstype` type, which can only be read anyway).

Regardless, generally mutable values are passed by a pointer to the object, because you may modify one of its fields, which must be reflected in all other contexts that have access to that object, be that other functions, closures or other things that have a (for this purpose) permanent lifetime.
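To make the sharing concrete, here is a minimal sketch. `Holder` and `bump!` are made-up names for illustration, not anything from your code. It shows that an immutable struct can still have the contents of its array field modified by a callee, while rebinding the field itself is what fails, and that is the point where you would need a `mutable struct`:

```julia
# Hypothetical immutable wrapper around an already allocated array.
struct Holder
    v::Vector{Float64}
end

# Mutates the contents of the array the field refers to; no field is rebound.
bump!(h::Holder) = (h.v .+= 1.0; h)

v = [1.0, 2.0]
h = Holder(v)
bump!(h)

v == [2.0, 3.0]   # true: the caller sees the change
h.v === v         # true: the field still refers to the same array

# Rebinding the field of an immutable struct throws an ErrorException.
rebind_failed = try
    h.v = [0.0]
    false
catch err
    err isa ErrorException
end
```

So as long as you only ever modify what the fields point to, an immutable struct is enough.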

Whether the individual fields of e.g. `Everything` are stored inline with an instance of that type or not depends partially on whether or not the type of that field is `isbitstype` - generally, arrays are stored as pointers, as are `Ref`s (since those objects are mutable and changes to their contents must be visible outside any potential objects wrapping them).
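If you want to check this for your own types, something like the following sketch should do. `Inline` and `Boxed` are hypothetical names I'm using only for this example:

```julia
# Hypothetical struct whose field is stored inline (plain bits).
struct Inline
    x::Float64
end

# Hypothetical struct whose field is a reference to a mutable box.
struct Boxed
    x::Base.RefValue{Float64}
end

isbitstype(Float64)                  # true
isbitstype(Inline)                   # true: immutable and pointer-free
isbitstype(Base.RefValue{Float64})   # false: RefValue is a mutable struct
isbitstype(Boxed)                    # false: it holds a reference
isbitstype(Vector{Float64})          # false: arrays are mutable
```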

Depending on the field type, that may be a bit awkward (and probably suboptimal for performance), since you'll just end up chasing pointers around. E.g. your `mutable struct Parta` is likely to be stored as a pointer, containing a pointer to the final value that you care about. This is no better than just having a field `a::Everything`, and strictly worse than just passing that `Everything` directly, which is only a single pointer deref (plus a field offset, which is a constant and will likely be added to the pointer directly for dereferencing).

In general, your use case seems a bit rare, or at least I haven't often encountered explicitly wrapping a `Float64` in a `Ref` like this. To me it seems much more prudent to just have a `mutable struct Everything` with an `a::Float64` field, pass that object around and modify the `a` field directly. The struct is already mutable after all, and doing it like that is just a single pointer deref instead of multiple. Maybe I'm missing something?

Do you mean that in general it would be better to create functions like:

``````julia
struct Everything{T<:Real,VT<:AbstractVector{T}}
    a::Ref{T}
    b::VT
    c::VT
end

function update_a!(Ev::Everything)
    Ev.a[] += 1.0
end

function update_bc!(Ev::Everything)
    Ev.b .+= 2.0
    Ev.c .+= 3.0
end
``````

If so, how does sharing a larger object (with "many" unrelated fields, as in the case of `update_a!`) affect performance? I guess it would be best to write `update_a!` for a `Ref{T}` and then pass it `Ev.a`?

Note that `Ref{T}` is actually an abstract type. In a struct field, you want a `Base.RefValue{T}` to avoid performance penalties. I.e., note that

``````julia> Ref(1234)
Base.RefValue{Int64}(1234)
``````
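A quick way to see the difference for a field declaration is sketched below. The struct names `AbstractField` and `ConcreteField` are made up for this example:

```julia
isabstracttype(Ref{Float64})            # true
isconcretetype(Base.RefValue{Float64})  # true

# Hypothetical struct with an abstractly typed field.
struct AbstractField{T}
    a::Ref{T}
end

# Hypothetical struct with a concretely typed field.
struct ConcreteField{T}
    a::Base.RefValue{T}
end

isconcretetype(fieldtype(AbstractField{Float64}, :a))  # false
isconcretetype(fieldtype(ConcreteField{Float64}, :a))  # true
```

With the abstract field type, the compiler cannot know the concrete type of `a` from the type of the struct alone, which is where the penalty comes from.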

It still works the same conceptually, but again, depending on how this is used, the assembly can look totally different:

`Everything` and the `Ref` don't escape
``````julia> function f()
e = Everything(Ref(1.0), Float64[], Float64[])
update_a!(e)
e.a[]
end
f (generic function with 1 method)

julia> @code_native f()
.text
.file	"f"
.section	.rodata.cst8,"aM",@progbits,8
.p2align	3                               # -- Begin function julia_f_549
.LCPI0_0:
.text
.globl	julia_f_549
.p2align	4, 0x90
.type	julia_f_549,@function
julia_f_549:                            # @julia_f_549
; β @ REPL[20]:1 within `f`
.cfi_startproc
# %bb.0:                                # %top
#APP
mov	rax, qword ptr fs:[0]
#NO_APP
mov	rax, qword ptr [rax - 8]
mov	rax, qword ptr [rax + 16]
mov	rax, qword ptr [rax + 16]
#MEMBARRIER
mov	rax, qword ptr [rax]
movabs	rax, offset .LCPI0_0
#MEMBARRIER
vmovsd	xmm0, qword ptr [rax]           # xmm0 = mem[0],zero
; β @ REPL[20]:4 within `f`
ret
.Lfunc_end0:
.size	julia_f_549, .Lfunc_end0-julia_f_549
.cfi_endproc
; β
# -- End function
.section	".note.GNU-stack","",@progbits
``````
The `Ref` escapes
``````julia> function f()
e = Everything(Ref(1.0), Float64[], Float64[])
update_a!(e)
e.a
end
f (generic function with 1 method)

julia> @code_native f()
.text
.file	"f"
.globl	julia_f_560                     # -- Begin function julia_f_560
.p2align	4, 0x90
.type	julia_f_560,@function
julia_f_560:                            # @julia_f_560
; β @ REPL[22]:1 within `f`
.cfi_startproc
# %bb.0:                                # %top
sub	rsp, 8
.cfi_def_cfa_offset 16
#APP
mov	rax, qword ptr fs:[0]
#NO_APP
mov	rax, qword ptr [rax - 8]
; β @ REPL[22]:2 within `f`
; ββ @ refpointer.jl:136 within `Ref`
; βββ @ refvalue.jl:10 within `RefValue` @ refvalue.jl:8
mov	esi, 1136
mov	edx, 16
mov	rcx, qword ptr [rax + 16]
mov	rcx, qword ptr [rcx + 16]
#MEMBARRIER
; βββ
; β @ REPL[22]:1 within `f`
mov	rcx, qword ptr [rcx]
#MEMBARRIER
; β @ REPL[22]:2 within `f`
; ββ @ refpointer.jl:136 within `Ref`
; βββ @ refvalue.jl:10 within `RefValue` @ refvalue.jl:8
mov	rdi, qword ptr [rax + 16]
movabs	rax, offset ijl_gc_pool_alloc
call	rax
movabs	rcx, 139865608574432
movabs	rdx, 4611686018427387904
mov	qword ptr [rax - 8], rcx
; βββ
; β @ REPL[22]:3 within `f`
; ββ @ REPL[1]:2 within `update_a!`
; βββ @ refvalue.jl:57 within `setindex!`
; ββββ @ Base.jl:38 within `setproperty!`
mov	qword ptr [rax], rdx
; ββββ
; β @ REPL[22]:4 within `f`
pop	rcx
.cfi_def_cfa_offset 8
ret
.Lfunc_end0:
.size	julia_f_560, .Lfunc_end0-julia_f_560
.cfi_endproc
; β
# -- End function
.type	.L_j_const1,@object             # @_j_const1
.section	.rodata.cst8,"aM",@progbits,8
.p2align	3
.L_j_const1:
.size	.L_j_const1, 8

.section	".note.GNU-stack","",@progbits
``````

You can ignore the stuff before the `#MEMBARRIER`, that's just an artifact from GC safepointing and how `@code_native` prints assembly.

Note how thereβs a call to `jl_gc_alloc` in the second version to allocate the `Ref`, where the first just returns a constant. So how the performance changes depends on how `Everything` ends up being used in your code, but in general (assuming a `Ref` field for `a` and the allocation for `Everything` canβt be elided and it ends up on the heap), it may look like this:

Full output
``````julia> struct Everything{T<:Real,VT<:AbstractVector{T}}
a::Base.RefValue{T}
b::VT
c::VT
end

julia> function update_a!(Ev::Everything)
Ev.a[] += 1.0
end
update_a! (generic function with 1 method)

julia> function update_bc!(Ev::Everything)
Ev.b .+= 2.0
Ev.c .+= 3.0
end
update_bc! (generic function with 1 method)

julia> function f(e)
update_a!(e)
update_bc!(e)
end
f (generic function with 1 method)

julia> code_native(f, (Everything{Float64, Vector{Float64}},))
.text
.file	"f"
.section	.rodata.cst8,"aM",@progbits,8
.p2align	3                               # -- Begin function julia_f_176
.LCPI0_0:
.text
.globl	julia_f_176
.p2align	4, 0x90
.type	julia_f_176,@function
julia_f_176:                            # @julia_f_176
; β @ REPL[8]:1 within `f`
.cfi_startproc
# %bb.0:                                # %top
sub	rsp, 8
.cfi_def_cfa_offset 16
#APP
mov	rax, qword ptr fs:[0]
#NO_APP
mov	rax, qword ptr [rax - 8]
movabs	rcx, offset .LCPI0_0
mov	rax, qword ptr [rax + 16]
mov	rax, qword ptr [rax + 16]
#MEMBARRIER
mov	rax, qword ptr [rax]
#MEMBARRIER
; β @ REPL[8]:2 within `f`
; ββ @ REPL[2]:2 within `update_a!`
; βββ @ Base.jl:37 within `getproperty`
mov	rax, qword ptr [rdi]
; βββ
; βββ @ refvalue.jl:56 within `getindex`
; ββββ @ Base.jl:37 within `getproperty`
vmovsd	xmm0, qword ptr [rax]           # xmm0 = mem[0],zero
; ββββ
; βββ @ float.jl:408 within `+`
vaddsd	xmm0, xmm0, qword ptr [rcx]
; βββ
; βββ @ refvalue.jl:57 within `setindex!`
; ββββ @ Base.jl:38 within `setproperty!`
vmovsd	qword ptr [rax], xmm0
; ββββ
; β @ REPL[8]:3 within `f`
movabs	rax, offset "j_update_bc!_178"
call	rax
pop	rcx
.cfi_def_cfa_offset 8
ret
.Lfunc_end0:
.size	julia_f_176, .Lfunc_end0-julia_f_176
.cfi_endproc
; β
# -- End function
.section	".note.GNU-stack","",@progbits
``````

The relevant section of which is

``````	movabs	rcx, offset .LCPI0_0
; β @ REPL[8]:2 within `f`
; ββ @ REPL[2]:2 within `update_a!`
; βββ @ Base.jl:37 within `getproperty`
mov	rax, qword ptr [rdi]
; βββ
; βββ @ refvalue.jl:56 within `getindex`
; ββββ @ Base.jl:37 within `getproperty`
vmovsd	xmm0, qword ptr [rax]           # xmm0 = mem[0],zero
; ββββ
; βββ @ float.jl:408 within `+`
vaddsd	xmm0, xmm0, qword ptr [rcx]
; βββ
; βββ @ refvalue.jl:57 within `setindex!`
; ββββ @ Base.jl:38 within `setproperty!`
vmovsd	qword ptr [rax], xmm0
; ββββ
; β @ REPL[8]:3 within `f`
movabs	rax, offset "j_update_bc!_178"
call	rax
pop	rcx
.cfi_def_cfa_offset 8
ret
``````

So first, the address of the constant `.LCPI0_0` (which holds `1.0`) is loaded into `rcx`. Then, the pointer to the `Ref` is loaded into `rax`, using the pointer to the `Everything` object passed in via `rdi`. That pointer is then dereferenced and the value loaded into `xmm0`. Finally, the constant is added (`vaddsd`) and the result stored back (`vmovsd`). `update_bc!` did not end up being inlined, so its address is loaded and the function is called.

Compare this to the same code using a `mutable` struct and no `Ref` at all:

Full code
``````julia> mutable struct Everything{T<:Real,VT<:AbstractVector{T}}
a::T
b::VT
c::VT
end

julia> function update_a!(Ev::Everything)
Ev.a += 1.0
end
update_a! (generic function with 1 method)

julia> function update_bc!(Ev::Everything)
Ev.b .+= 2.0
Ev.c .+= 3.0
end
update_bc! (generic function with 1 method)

julia> function f(e)
update_a!(e)
update_bc!(e)
end
f (generic function with 1 method)

julia> code_native(f, (Everything{Float64, Vector{Float64}},))
.text
.file	"f"
.section	.rodata.cst8,"aM",@progbits,8
.p2align	3                               # -- Begin function julia_f_93
.LCPI0_0:
.text
.globl	julia_f_93
.p2align	4, 0x90
.type	julia_f_93,@function
julia_f_93:                             # @julia_f_93
; β @ REPL[4]:1 within `f`
.cfi_startproc
# %bb.0:                                # %top
sub	rsp, 8
.cfi_def_cfa_offset 16
#APP
mov	rax, qword ptr fs:[0]
#NO_APP
mov	rax, qword ptr [rax - 8]
mov	rax, qword ptr [rax + 16]
mov	rax, qword ptr [rax + 16]
#MEMBARRIER
mov	rax, qword ptr [rax]
#MEMBARRIER
; β @ REPL[4]:2 within `f`
; ββ @ REPL[2]:2 within `update_a!`
; βββ @ Base.jl:37 within `getproperty`
vmovsd	xmm0, qword ptr [rdi]           # xmm0 = mem[0],zero
movabs	rax, offset .LCPI0_0
; βββ
; βββ @ float.jl:408 within `+`
vaddsd	xmm0, xmm0, qword ptr [rax]
; βββ
; β @ REPL[4]:3 within `f`
movabs	rax, offset "j_update_bc!_95"
; β @ REPL[4]:2 within `f`
; ββ @ REPL[2]:2 within `update_a!`
; βββ @ Base.jl:38 within `setproperty!`
vmovsd	qword ptr [rdi], xmm0
; βββ
; β @ REPL[4]:3 within `f`
call	rax
pop	rcx
.cfi_def_cfa_offset 8
ret
.Lfunc_end0:
.size	julia_f_93, .Lfunc_end0-julia_f_93
.cfi_endproc
; β
# -- End function
.section	".note.GNU-stack","",@progbits
``````

The relevant part of which is

``````; β @ REPL[4]:2 within `f`
; ββ @ REPL[2]:2 within `update_a!`
; βββ @ Base.jl:37 within `getproperty`
vmovsd	xmm0, qword ptr [rdi]           # xmm0 = mem[0],zero
movabs	rax, offset .LCPI0_0
; βββ
; βββ @ float.jl:408 within `+`
vaddsd	xmm0, xmm0, qword ptr [rax]
; βββ
; β @ REPL[4]:3 within `f`
movabs	rax, offset "j_update_bc!_95"
; β @ REPL[4]:2 within `f`
; ββ @ REPL[2]:2 within `update_a!`
; βββ @ Base.jl:38 within `setproperty!`
vmovsd	qword ptr [rdi], xmm0
; βββ
; β @ REPL[4]:3 within `f`
call	rax
pop	rcx
.cfi_def_cfa_offset 8
ret
``````

Here we can just load the field `a` directly (`vmovsd xmm0, qword ptr [rdi]`) without having to chase an additional pointer through the `Ref`. In a hot loop, that can make a difference, since the additional pointer load is one more opportunity for a cache miss.

Unless you have a really good (semantic!) reason to require the field `a` to have the ability to live longer than the `Everything` object, wrapping it in a `Ref` like that is in general very likely to be strictly worse than just having a `mutable` struct with no `Ref` at all, since `Float64` is an `isbitstype` (i.e. pointer-free) and is stored inline in whatever is wrapping it.
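To illustrate what such a semantic reason could look like, here is a small sketch with made-up types `Owner` and `Sharer` (not from your code). The `Ref` version only buys you something if several objects must observe the same scalar:

```julia
# Hypothetical: owns its scalar inline, no Ref needed.
mutable struct Owner
    a::Float64
end

# Hypothetical: refers to a box that may be shared and may outlive the struct.
struct Sharer
    a::Base.RefValue{Float64}
end

r = Ref(0.0)
s1 = Sharer(r)
s2 = Sharer(r)
s1.a[] += 1.0
s2.a[] == 1.0   # true: both wrappers see the same box

o1 = Owner(0.0)
o2 = Owner(o1.a)
o1.a += 1.0
o2.a == 0.0     # true: the Float64 was copied, nothing is shared
```

If you never need the first behavior, the inline field is the simpler choice.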

Maybe Iβm missing something about how your mental model of julia works though. For example, this part:

doesnβt necessarily make sense depending on the type of `a` since we donβt generally say that we βallocateβ a `Float64`. Itβs an immutable value, most of the time just being passed via registers (if at all). There is no ability to take a pointer to an immutable value that landed on the stack - or rather, there is no semantic guarantee that a `Ref` to a `Float64` ends up as a pointer to a previously used location on the stack (though it may happen if the compiler deems it safeβ¦ the more likely case is for the `Ref` to be heap allocated and tracked by GC though, since itβs mutable and a mutable values that escape the function theyβre defined in are guaranteed to be safe when allocated on the heap).

So from that POV, having a "previously allocated `RefValue{Float64}`" raises the question of why that `RefValue` needs to exist like that in the first place, when the `Everything` struct seems to be the owner of that value. The convenience structs don't really gain you anything in terms of performance or code size, since they too are either passed as a pointer (if they're mutable and they escape the scope they're created in) or are just eliminated entirely (likely to happen if they're immutable and just hold the `Ref`), at which point passing either the `Everything` or the `Ref` field directly achieves the same thing, with less code and less mental overhead about which wrapper structs refer to which `Everything` object.
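As a rough sketch of what I mean by passing the object or the field directly (the names `EvSketch` and `increment!` are hypothetical, chosen so they don't clash with your definitions):

```julia
# Hypothetical stand-in for your struct, with a concretely typed Ref field.
struct EvSketch{T<:Real,VT<:AbstractVector{T}}
    a::Base.RefValue{T}
    b::VT
    c::VT
end

# One method works on the Ref itself, one on the whole object; no wrapper struct.
increment!(r::Base.RefValue{<:Real}) = (r[] += 1; r)
increment!(ev::EvSketch) = (increment!(ev.a); ev)

ev = EvSketch(Ref(0.0), zeros(2), zeros(2))
increment!(ev)      # pass the whole object
increment!(ev.a)    # or pass just the field
ev.a[] == 2.0       # true: both calls updated the same box
```

Both call styles touch the same `Ref`, so which one you pick is mostly a question of which reads better at the call site.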

To circle back around to whether you should use mutable or immutable structs for your `Everything`: to me, there isn't enough information here to definitively recommend one over the other, specifically between these two versions:

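``````julia
# Version 1: immutable struct, `a` lives in a separate mutable box.
struct Everything{T,VT<:AbstractVector{T}}
    a::Base.RefValue{T}
    b::VT
    c::VT
end

# Version 2 (alternative definition): mutable struct, `a` stored inline.
# `const` fields in a mutable struct require Julia 1.8 or newer.
mutable struct Everything{T,VT<:AbstractVector{T}}
    a::T
    const b::VT
    const c::VT
end
``````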

The reason is that these two express different intentions with regard to what `a` means and how it's used in your program (though personally I'd much prefer the mutable version here, purely because I don't want to write `ev.a[]` everywhere and I wouldn't expect the `a` to live longer than `Everything`).
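For what it's worth, here is how the mutable version behaves in use. This is only a sketch, assuming Julia 1.8 or newer for `const` fields, and I've renamed the struct to the hypothetical `EverythingM` so it doesn't clash with anything already defined in your session:

```julia
# Hypothetical copy of the mutable version under a different name.
mutable struct EverythingM{T,VT<:AbstractVector{T}}
    a::T
    const b::VT
    const c::VT
end

ev = EverythingM(0.0, zeros(3), zeros(3))

ev.a += 1.0      # fine: `a` is a regular mutable field
ev.b .+= 2.0     # fine: mutates the array contents, the field is not rebound

# Rebinding a `const` field throws an ErrorException.
rebind_failed = try
    ev.b = ones(3)
    false
catch err
    err isa ErrorException
end
```

So `b` and `c` keep the "always refers to the same array" guarantee of the immutable version, while `a` can be updated without a `Ref`.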