Mutable scalar in immutable object: the best alternative?

I found this old thread about the same thing, but that was before 1.0. While writing this post I figured out most of the details, but I will post it anyway.

If one has an immutable struct with many fields, and wants one of the scalar fields to be mutated, some workarounds involve defining it as a one-element array, a zero-element array, or using Ref. A small example is shown below.

It seems that Ref is faster than the other options. I didn’t expect that, in particular relative to an MVector{1,Float64}. If anyone has any comments on that, I would appreciate them.

using BenchmarkTools, StaticArrays

struct A{T}
  val::T
end

#
# Test with Ref or 0-element array (no index)
#
function test_ref!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    v[i].val[] = v[i].val[] + 0.1
    s += v[i].val[]
  end
  s
end

#
# Test with vectors (first element)
#
function test_vec!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    v[i].val[1] = v[i].val[1] + 0.1
    s += v[i].val[1]
  end
  s
end

# Run

N = 10_000

T = Vector{Float64}
v = A{T}[ A(rand(1)) for i in 1:N ];
print(" Vec: "); @btime test_vec!($v)

T = Array{Float64,0}
v = A{T}[ A{T}(fill(rand())) for i in 1:N ];
print(" Array0: "); @btime test_ref!($v)

T = MVector{1,Float64}
v = A{T}[ A(rand(T)) for i in 1:N ];
print(" MVec: "); @btime test_vec!($v)

T = Base.RefValue{Float64}
v = A{T}[ A{T}(Ref(rand())) for i in 1:N ];
print(" Ref: "); @btime test_ref!($v)



Result:

 Vec:   22.370 μs (0 allocations: 0 bytes)
 Array0:   22.558 μs (0 allocations: 0 bytes)
 MVec:   23.943 μs (0 allocations: 0 bytes)
 Ref:   18.344 μs (0 allocations: 0 bytes)

I usually use Setfield.jl in this case. It requires a change in coding style, i.e.

f!(x)

should be changed to

x = f!(x)

and the function f! should return the updated version of the object.

I haven’t yet encountered a case where such a change is impossible, and it has solved the problem of needing mutable scalars.
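To make the x = f!(x) pattern concrete, here is a minimal sketch in plain Julia, without Setfield. The struct P and the helper with_val are hypothetical names used only for illustration; as far as I understand, Setfield's @set macro builds this kind of modified copy generically, so the helper would not need to be written by hand.

```julia
# Immutable struct with more than one field (hypothetical example type).
struct P{T}
  val::T
  label::String
end

# Hypothetical helper: returns a copy of `p` with `val` replaced.
# With Setfield, `@set p.val = x` should play this role.
with_val(p::P, x) = P(x, p.label)

# Nothing is mutated here; `f!` returns a new object
# and the caller is responsible for rebinding it.
f!(p::P) = with_val(p, p.val + 0.1)

x = P(1.0, "a")
x = f!(x)   # rebinding is what makes the update visible
```

Calling f!(x) without the assignment would leave x unchanged, which is the main thing to keep in mind with this style.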

Nitpick: it’s “zero-dimensional array”.

Apparently this option has to be used with some caution:

using BenchmarkTools, Setfield

struct A{T}
  val::T
end

#
# Test with Setfield - using pairs
#
function test_set1!(v)
  s = 0.
  @inbounds for (i,el) in pairs(v)
    @set! el.val = el.val + 0.1
    v[i] = el
    s += el.val
  end
  s
end

#
# Test with Setfield, updating directly
#
function test_set2!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    @set! v[i].val = v[i].val + 0.1
    s += v[i].val
  end
  s
end

N = 10_000
T = Float64
v = A{T}[ A{T}(rand()) for i in 1:N ];
print(" @set1: "); @btime test_set1!(v0) setup = (v0=deepcopy($v))
print(" @set2: "); @btime test_set2!(v0)  setup = (v0=deepcopy($v))

Result:

julia> print(" @set1: "); @btime test_set1!(v0) setup = (v0=deepcopy($v))
 @set1:   10.029 μs (0 allocations: 0 bytes)
6030.586662602187

julia> print(" @set2: "); @btime test_set2!(v0)  setup = (v0=deepcopy($v))
 @set2:   67.330 ms (20000 allocations: 763.70 MiB)
6030.586662602187

The first option works great, but it seems that particular care has to be taken in the way we write the loop.
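A possible explanation for the @set2 numbers, offered as an assumption rather than something verified against the Setfield source: @set! v[i].val = ... seems to rebind the local variable v to a modified copy of the whole vector on every iteration. The sketch below uses a hand-written stand-in (set_copy is a hypothetical name, not a Setfield function) and checks that the arithmetic is consistent with the reported allocation.

```julia
struct A{T}
  val::T
end

# Hypothetical stand-in for what `@set! v[i].val = x` appears to do:
# copy the entire vector, change one element, return the copy.
function set_copy(v::Vector{A{T}}, i, x) where {T}
  w = copy(v)
  w[i] = A{T}(x)
  return w
end

v = [A(0.0) for _ in 1:10_000]
w = set_copy(v, 1, 0.1)   # `v` itself is left untouched

# One copy holds 10_000 elements of 8 bytes each.
bytes_per_copy = sizeof(A{Float64}) * length(v)
# One copy per loop iteration, 10_000 iterations, converted to MiB.
total_mib = bytes_per_copy * 10_000 / 2^20
```

The data alone comes to roughly 763 MiB, which is close to the 763.70 MiB reported. If this reading is right, test_set2! also never modifies the vector that was passed in, because only the local binding changes.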

Also, it seems that v[i] = el replaces the complete struct at that position of the vector v. I would guess that at some point (if the struct is large enough) that becomes slow, just as static arrays stop being worthwhile above some size. I tried to test that but did not reach such a limit: the performance apparently did not depend on the size of the struct fields. (I tested immutable fields such as tuples; for array fields I would not expect any difference, since arrays are not copied.)

@Skoffer, sorry for bothering you. If you have any comment on this, please let me know.

Another option is:

  1. Use a mutable struct
  2. Write methods to read the fields of that struct and to set the one field you actually want to change
  3. Ensure that you use those methods, rather than directly interacting with the struct fields

The advantage of this is that it requires no Ref wrapper; the disadvantage is that it remains possible for a user to reach into your object and do something unexpected with the internal fields.
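A minimal sketch of the three steps above, with hypothetical names (Particle, mass, charge, set_charge!) chosen only for illustration:

```julia
# 1. A mutable struct; only `charge` is meant to change after construction.
mutable struct Particle
  mass::Float64
  charge::Float64
end

# 2. Accessor methods for reading, and a setter for the one field
#    that is allowed to change.
mass(p::Particle) = p.mass
charge(p::Particle) = p.charge

function set_charge!(p::Particle, q)
  p.charge = q
  return p
end

# 3. Use the methods instead of touching the fields directly.
p = Particle(1.0, 0.0)
set_charge!(p, -1.0)
```

Nothing here stops someone from writing p.mass = 2.0 directly; the protection is by convention only.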

But if I have a struct with many fields, making the whole struct mutable might have a performance penalty, doesn’t it? I am thinking about a particular case where only one of the fields, a scalar, has to be mutated.

The only reason I can think of would be the layout optimization from "internal changes (support pointers inlining/unboxing into parents/codegen) [disabled]" by vtjnash (Pull Request #33886, JuliaLang/julia on GitHub), new in 1.5. It’s certainly worth actually doing a benchmark, though.

An idea: use a mutable struct, but overload Base.setproperty! to choose by yourself which specific fields are (im)mutable:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.4 (2021-03-11)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia> mutable struct A
               immut::Int
               mut::Float64
       end

julia> obj = A(3, 3.14)
A(3, 3.14)

julia> obj.immut = 7
7

julia> import Base.setproperty!

julia> function setproperty!(arg::A, s::Symbol, new)
               s === :immut && error("access to immut is not allowed")
               invoke(setproperty!, Tuple{Any, Symbol, Any}, arg, s, new)
       end
setproperty! (generic function with 7 methods)

julia> obj.immut = 7
ERROR: access to immut is not allowed
Stacktrace:
 [1] setproperty!(::A, ::Symbol, ::Int64) at ./REPL[5]:2
 [2] top-level scope at REPL[6]:1
 [3] run_repl(::REPL.AbstractREPL, ::Any) at /build/julia/src/julia-1.5.4/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:288

julia> obj.mut = 5.9
5.9

julia> obj
A(7, 5.9)

julia> obj.immut = 9
ERROR: access to immut is not allowed
Stacktrace:
 [1] setproperty!(::A, ::Symbol, ::Int64) at ./REPL[5]:2
 [2] top-level scope at REPL[9]:1
 [3] run_repl(::REPL.AbstractREPL, ::Any) at /build/julia/src/julia-1.5.4/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:288

I’m not sure how well this optimizes, though.

EDIT: Another idea:

julia> setproperty!(::A, ::Symbol, ::Any) = error("use getters and setters to access A objects")
setproperty! (generic function with 7 methods)

and then define some getters and setters. EDIT: actually, being able to define meaningful setters after this would probably require something like a wrapped type, I think. I’m out of my depth.
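One way that seems to work without a wrapper type is to have the setters call the builtin setfield!, which does not go through setproperty!. The sketch below uses hypothetical names (Acc, get_mut, set_mut!) and is only meant to show the idea; I have not checked how well it optimizes.

```julia
mutable struct Acc
  immut::Int
  mut::Float64
end

# Block every `obj.field = value` assignment on Acc objects.
Base.setproperty!(::Acc, ::Symbol, ::Any) =
  error("use getters and setters to access Acc objects")

# Hypothetical accessors: getfield/setfield! are builtins and
# bypass the setproperty! method defined above.
get_mut(a::Acc) = getfield(a, :mut)
set_mut!(a::Acc, x) = (setfield!(a, :mut, convert(Float64, x)); a)

obj = Acc(3, 3.14)
set_mut!(obj, 5.9)        # allowed: goes through setfield!

# Direct assignment now raises the error defined above.
blocked = try
  obj.mut = 1.0
  false
catch err
  err isa ErrorException
end
```

A determined user can still call setfield! themselves, so this only guards against accidental assignment.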

I have added the mutable struct option and the @set! option to the tests. In this minimal example, using Setfield with @set! turns out to be faster than the other alternatives, but the way one iterates over the elements is very important (I might be doing something wrong when using @set! while iterating over indexes). The other alternatives are not as sensitive to the way one iterates:

Code
using BenchmarkTools, StaticArrays, Setfield

struct A{T}
  val::T
end

mutable struct M{T}
  val::T
end

#
# Test with Ref
#
function test_ref_index!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    v[i].val[] = v[i].val[] + 0.1
    s += v[i].val[]
  end
  s
end

function test_ref_pairs!(v)
  s = 0.
  @inbounds for (i,el) in pairs(v)
    el.val[] = el.val[] + 0.1
    s += el.val[]
    v[i] = el
  end
  s
end

#
# Test with vectors
#
function test_vec_index!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    v[i].val[1] = v[i].val[1] + 0.1
    s += v[i].val[1]
  end
  s
end

function test_vec_pairs!(v)
  s = 0.
  @inbounds for (i,el) in pairs(v)
    el.val[1] = el.val[1] + 0.1
    s += el.val[1]
    v[i] = el
  end
  s
end

#
# Test with Setfield
#
function test_set_index!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    @set! v[i].val = v[i].val + 0.1
    s += v[i].val
  end
  s
end

function test_set_pairs!(v)
  s = 0.
  @inbounds for (i,el) in pairs(v)
    @set! el.val = el.val + 0.1
    v[i] = el
    s += el.val
  end
  s
end

# With mutable struct
function test_mut_index!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    v[i].val = v[i].val + 0.1
    s += v[i].val
  end
  s
end

function test_mut_pairs!(v)
  s = 0.
  @inbounds for (i,el) in pairs(v)
    el.val = el.val + 0.1
    v[i] = el
    s += el.val
  end
  s
end


# Run

N = 10_000

T = Float64
v = M{T}[ M{T}(rand()) for i in 1:N ];
print(" mut_index: "); @btime test_mut_index!($v)
print(" mut_pairs: "); @btime test_mut_pairs!($v)

T = Float64
v = A{T}[ A{T}(rand()) for i in 1:N ];
print(" @set_index: "); @btime test_set_index!($v)
print(" @set_pairs: "); @btime test_set_pairs!($v)

T = Vector{Float64}
v = A{T}[ A(rand(1)) for i in 1:N ];
print(" Vec_index: "); @btime test_vec_index!($v)
print(" Vec_pairs: "); @btime test_vec_pairs!($v)

T = Array{Float64,0}
v = A{T}[ A{T}(fill(rand())) for i in 1:N ];
print(" Array0_index: "); @btime test_ref_index!($v)
print(" Array0_pairs: "); @btime test_ref_pairs!($v)

T = MVector{1,Float64}
v = A{T}[ A(rand(T)) for i in 1:N ];
print(" MVec_index: "); @btime test_vec_index!($v)
print(" MVec_pairs: "); @btime test_vec_pairs!($v)

T = Base.RefValue{Float64}
v = A{T}[ A{T}(Ref(rand())) for i in 1:N ];
print(" Ref_index:: "); @btime test_ref_index!($v)
print(" Ref_pairs: "); @btime test_ref_pairs!($v)



Results:

# mutable struct
 mut_index:   15.854 μs (0 allocations: 0 bytes)
 mut_pairs:   18.482 μs (0 allocations: 0 bytes)
# Setfield
 @set_index:   69.037 ms (20000 allocations: 763.70 MiB) # something wrong?
 @set_pairs:   10.027 μs (0 allocations: 0 bytes) # FASTER
# Standard vector with 1 element
 Vec_index:   22.507 μs (0 allocations: 0 bytes)
 Vec_pairs:   22.157 μs (0 allocations: 0 bytes)
# Array of dimension 0
 Array0_index:   22.971 μs (0 allocations: 0 bytes)
 Array0_pairs:   22.650 μs (0 allocations: 0 bytes)
# Static mutable array of one element
 MVec_index:   20.970 μs (0 allocations: 0 bytes)
 MVec_pairs:   21.036 μs (0 allocations: 0 bytes)
# RefValue
 Ref_index::   18.291 μs (0 allocations: 0 bytes)
 Ref_pairs:   21.491 μs (0 allocations: 0 bytes)

Why not use a mutable struct within an immutable struct?

mutable struct M{T}
  val::T
end

#
struct B{T}
    val::M{T}
end

# mutable within immutable
function test_mut_immut_index!(v)
  s = 0.
  @inbounds for i in eachindex(v)
    v[i].val.val = v[i].val.val + 0.1
    s += v[i].val.val
  end
  s
end

function test_mut_immut_pairs!(v)
  s = 0.
  @inbounds for (i,el) in pairs(v)
    el.val.val = el.val.val + 0.1
    v[i] = el
    s += el.val.val
  end
  s
end

T = Float64
v = B{T}[ B{T}(M{T}(rand())) for i in 1:N ];
print(" mut_immut_index: "); @btime test_mut_immut_index!($v)
print(" mut_immut_pairs: "); @btime test_mut_immut_pairs!($v)

Results on my computer

julia> print(" mut_immut_index: "); @btime test_mut_immut_index!($v)
 mut_index:   15.900 μs (0 allocations: 0 bytes)
2.9915904161157984e8

julia> print(" mut_immut_pairs: "); @btime test_mut_immut_pairs!($v)
 mut_pairs:   18.600 μs (0 allocations: 0 bytes)
5.658470416076994e8

versus Ref

julia> print(" Ref_index:: "); @btime test_ref_index!($v)
 Ref_index::   14.800 μs (0 allocations: 0 bytes)
3.2040796745854414e8

julia> print(" Ref_pairs: "); @btime test_ref_pairs!($v)
 Ref_pairs:   17.000 μs (0 allocations: 0 bytes)
6.090409674543456e8

This is redundant for this simple example of only one element in a struct, but if your struct is large and contains many mutable and immutable elements, it could be an option to split it into a mutable sub-struct and an immutable sub-struct.
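As a rough sketch of that layout for a struct with several fields (Config and Counter are hypothetical names, not taken from the benchmarks above):

```julia
# The part that is allowed to change lives in its own mutable struct.
mutable struct Counter
  hits::Int
end

# The outer struct stays immutable and holds a reference to the mutable part.
struct Config
  name::String
  limits::NTuple{3,Float64}
  counter::Counter
end

c = Config("run", (1.0, 2.0, 3.0), Counter(0))
c.counter.hits += 1       # allowed: mutates the inner Counter

# Assigning to a field of the immutable outer struct raises an error.
threw = try
  c.name = "other"
  false
catch err
  err isa ErrorException
end
```

The outer fields cannot be reassigned, while everything that needs updating is grouped behind a single mutable field.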
