I am trying to better understand the performance implications of using mutable structs. My actual use case is similar to how the Optim package stores a state variable that changes every iteration (for example, https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/multivariate/solvers/first_order/cg.jl).
My concern is that accessing variables in a struct is more expensive than passing those variables directly as inputs to a function. I am still learning the details of @btime, so the code below shows options with and without interpolating the variables, and with the referencing and dereferencing trick from the BenchmarkTools documentation (JuliaCI/BenchmarkTools.jl on GitHub).
MWE of a mix of functions with typed vs. non-typed inputs and the inputs being structs or direct variables.
using BenchmarkTools
mutable struct mystruct
    myint::Int
    mysym::Symbol
end
mutable struct mystruct_nottyped
    myint
    mysym
end
s = mystruct(1, :randomsymbol)
s_nottyped = mystruct_nottyped(1, :randomsymbol)
function accessfromstruct(s::mystruct)
    res = 0 # this should allocate an int
    for _ = 1:100 # force the function to take longer
        if s.mysym == :randomsymbol # testing reading a field in the structure
            res += 1
            s.myint += res # testing setting a field in the structure
        end
    end
    return res, s
end
function accessfromstruct_nottyped(s) # same as above, but input to function is not typed
    res = 0
    for _ = 1:100
        if s.mysym == :randomsymbol
            res += 1
            s.myint += res
        end
    end
    return res, s
end
accessfromstruct(s) # precompile
accessfromstruct_nottyped(s); accessfromstruct_nottyped(s_nottyped)
@btime accessfromstruct(s) # 143 ns, 0 bytes
@btime accessfromstruct_nottyped(s) # 54 ns, 16 bytes
@btime accessfromstruct_nottyped(s_nottyped) # 2.8 us, 1.6 KB
# it makes sense the last one is the worst since nothing is typed
# but why is the second faster than the first?
@btime accessfromstruct($s) # 31 ns, 0 bytes
@btime accessfromstruct_nottyped($s) # 31 ns, 0 bytes
@btime accessfromstruct_nottyped($s_nottyped) # 2.8 us, 1.6 KB
function accessdirectly(i::Int, s::Symbol)
    res = 0
    for _ = 1:100
        if s == :randomsymbol
            res += 1
            i += res
        end
    end
    return res, i
end
mysym = :randomsymbol; myint = 1
accessdirectly(myint, mysym)
@btime accessdirectly(myint, mysym) # 4 ns, 0 bytes
@btime accessdirectly($(Ref(myint))[], $(Ref(mysym))[]) # 1 ns, 0 bytes
function accessdirectly_nottyped(i, s)
    res = 0
    for _ = 1:100
        if s == :randomsymbol
            res += 1
            i += res
        end
    end
    return res, i
end
accessdirectly_nottyped(myint, mysym)
@btime accessdirectly_nottyped(myint, mysym) # 18 ns, 32 bytes
@btime accessdirectly_nottyped($myint, $mysym) # 1 ns, 0 bytes
@btime accessdirectly($(Ref(myint))[], $(Ref(mysym))[]) # 1 ns, 0 bytes
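As a sanity check on the "nothing is typed" comment above, here is a minimal sketch that inspects the declared field types of the two kinds of struct. It uses only Base functions (fieldtypes, isconcretetype); the names TypedState and UntypedState are made up for this illustration so the snippet runs on its own, and they mirror mystruct and mystruct_nottyped.

```julia
# Hypothetical stand-ins for `mystruct` and `mystruct_nottyped`.
mutable struct TypedState
    myint::Int
    mysym::Symbol
end

mutable struct UntypedState
    myint
    mysym
end

# Fields without an annotation are declared as `Any`.
println(fieldtypes(TypedState))
println(fieldtypes(UntypedState))

# True only when every field has a concrete declared type, so the compiler
# knows the type of each field access from the struct type alone.
has_concrete_fields(T) = all(isconcretetype, fieldtypes(T))

println(has_concrete_fields(TypedState))
println(has_concrete_fields(UntypedState))
```

If the second struct reports non-concrete fields, that would be consistent with the 1.6 KB of allocations measured for accessfromstruct_nottyped(s_nottyped), though this check alone does not prove where the time goes.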
Specific points and questions:
- Since the function is allocating, I am guessing any @btime result with 0 allocations is not properly measured.
- The most concerning comparison to me is
  @btime accessfromstruct_nottyped(s) # 54 ns, 16 bytes
  versus
  @btime accessdirectly_nottyped(myint, mysym) # 18 ns, 32 bytes
  If both functions are not typed, why does using the struct lead to a longer run time?
- Ideally, I’d like to use something like accessfromstruct(s), where the struct fields are typed and the function inputs are too. I’m not sure I’m running @btime correctly for that case, but the typed function with typed, non-struct inputs (@btime accessdirectly(myint, mysym)) might do better than the typed function with a typed struct input. Is there a better way to test these with @btime?
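One variant I could also try, if I am reading the BenchmarkTools documentation correctly, is the setup option, which builds the input inside the benchmark instead of reading it from a global. The sketch below is only that: BenchState and bump! are hypothetical names that mirror mystruct and accessfromstruct so the snippet is self-contained.

```julia
using BenchmarkTools

# Hypothetical stand-in for `mystruct`.
mutable struct BenchState
    myint::Int
    mysym::Symbol
end

# Same loop as `accessfromstruct`: reads one field and updates another.
function bump!(s::BenchState)
    res = 0
    for _ in 1:100
        if s.mysym == :randomsymbol
            res += 1
            s.myint += res
        end
    end
    return res, s
end

# Option 1: interpolate the global with `$`, so the benchmark does not pay
# for looking up an untyped global variable on every call.
state = BenchState(1, :randomsymbol)
@btime bump!($state)

# Option 2: construct the struct in `setup`. Variables defined there are
# local to the benchmark, so no `$` is needed. The setup expression runs
# once per sample, so the same struct is reused across the evaluations
# within that sample.
@btime bump!(x) setup=(x = BenchState(1, :randomsymbol))
```

Both forms keep the non-constant global out of the timed expression, which is my best guess for why the non-interpolated timings above look so different from the interpolated ones.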