Time cost for using structs

I am trying to better understand the performance implications of using mutable structs. My actual use case is similar to how the Optim package stores a state variable that changes every iteration (see https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/multivariate/solvers/first_order/cg.jl).

My concern is that accessing variables in a structure is more expensive than if those variables were direct inputs to a function. I am still learning the details of @btime, so the code below shows options with and without interpolating the variables, and with the referencing and dereferencing trick from the BenchmarkTools documentation (JuliaCI/BenchmarkTools.jl on GitHub).

MWE of a mix of functions with typed vs. non-typed inputs and the inputs being structs or direct variables.

using BenchmarkTools

mutable struct mystruct
	myint::Int
	mysym::Symbol
end
mutable struct mystruct_nottyped
	myint
	mysym
end

s = mystruct(1, :randomsymbol)
s_nottyped = mystruct_nottyped(1, :randomsymbol)

function accessfromstruct(s::mystruct)
	res = 0	  # this should allocate an int 
	for _ = 1:100 # force the function to take longer
		if s.mysym == :randomsymbol # testing reading a field in the structure 
			res += 1
			s.myint += res # testing setting a field in the structure 
		end
	end
	return res, s 
end
function accessfromstruct_nottyped(s) # same as above, but input to function is not typed
	res = 0
	for _ = 1:100
		if s.mysym == :randomsymbol
			res += 1
			s.myint += res
		end
	end
	return res, s
end
accessfromstruct(s) # precompile 
accessfromstruct_nottyped(s); accessfromstruct_nottyped(s_nottyped)

@btime accessfromstruct(s) 								# 143 ns, 0 bytes
@btime accessfromstruct_nottyped(s)						#  54 ns, 16 bytes
@btime accessfromstruct_nottyped(s_nottyped)			# 2.8 us, 1.6 KB 
# it makes sense the last one is the worst since nothing is typed 
# but why is the second faster than the first?

@btime accessfromstruct($s) 							#  31 ns, 0 bytes
@btime accessfromstruct_nottyped($s)					#  31 ns, 0 bytes
@btime accessfromstruct_nottyped($s_nottyped)			# 2.8 us, 1.6 KB

function accessdirectly(i::Int, s::Symbol)
	res = 0
	for _ = 1:100
		if s == :randomsymbol
			res += 1
			i += res
		end
	end
	return res, i
end
mysym = :randomsymbol; myint = 1
accessdirectly(myint, mysym)
@btime accessdirectly(myint, mysym)						#  4 ns, 0 bytes 
@btime accessdirectly($(Ref(myint))[], $(Ref(mysym))[]) #  1 ns, 0 bytes 

function accessdirectly_nottyped(i, s)
	res = 0
	for _ = 1:100
		if s == :randomsymbol
			res += 1
			i += res
		end
	end
	return res, i
end
accessdirectly_nottyped(myint, mysym)
@btime accessdirectly_nottyped(myint, mysym) 			# 18 ns, 32 bytes 
@btime accessdirectly_nottyped($myint, $mysym)			#  1 ns, 0 bytes
@btime accessdirectly($(Ref(myint))[], $(Ref(mysym))[]) #  1 ns, 0 bytes

Specific points and questions:

  • Since the function is allocating, I am guessing any @btime result with 0 allocations is not properly measured.
  • The most concerning comparison to me is
    @btime accessfromstruct_nottyped(s) # 54 ns, 16 bytes versus
    @btime accessdirectly_nottyped(myint, mysym) # 18 ns, 32 bytes.
    If both functions are not typed, why does using the struct lead to longer run-time?
  • Ideally, I’d like to use something like accessfromstruct(s) where the struct fields are typed and the function inputs are too. I’m not sure I’m running btime correctly for that case, but the typed function with typed, non-struct inputs (@btime accessdirectly(myint, mysym)) might do better than the typed function with a typed, struct input. Is there a better way to test these with btime?

I don’t see any allocations in your code, so the @btime values for allocations are probably correct.

You need to type struct fields but not function input arguments, since the compiler performs type inference based on the supplied inputs. In most cases there is zero performance advantage to typing function argument signatures.

Type your struct fields; typing function signatures is optional. Also, consider whether you can use an immutable struct. If you can, that may open up additional compiler optimizations. In most cases I would expect zero overhead from using a type-stable struct.
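As a rough sketch of the immutable alternative (the names FrozenState and advance are made up for illustration, not taken from the original code): instead of mutating a field inside the loop, the function works on a local copy and returns a new struct.

```julia
# Hypothetical immutable counterpart of mystruct; the names are illustrative only.
struct FrozenState
    myint::Int
    mysym::Symbol
end

# Works on a local copy of the field and returns a fresh struct
# instead of mutating the input.
function advance(s::FrozenState)
    res = 0
    myint = s.myint
    for _ in 1:100
        if s.mysym == :randomsymbol
            res += 1
            myint += res
        end
    end
    return res, FrozenState(myint, s.mysym)
end

s0 = FrozenState(1, :randomsymbol)
res, s1 = advance(s0)   # s0 is left unchanged; s1 carries the updated value
```

Whether this is actually faster than the mutable version depends on the use case, so it is worth benchmarking both.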

BTW, there’s no reason to write separate functions accessfromstruct and accessfromstruct_nottyped; their bodies are identical. Just write one function without a type annotation and pass it the structs with typed and with untyped fields, respectively. The compiler compiles a specialized method for each.
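A minimal sketch of that idea, with made-up names (TypedState, UntypedState, accumulate!) standing in for the ones in the question: a single untyped function is called with both kinds of struct, and fieldtype shows why only one of them can be inferred well.

```julia
mutable struct TypedState      # hypothetical name; concrete field types
    myint::Int
    mysym::Symbol
end

mutable struct UntypedState    # hypothetical name; fields default to Any
    myint
    mysym
end

# One generic method; Julia compiles a specialization per argument type.
function accumulate!(s)
    res = 0
    for _ in 1:100
        if s.mysym == :randomsymbol
            res += 1
            s.myint += res
        end
    end
    return res, s
end

typed = TypedState(1, :randomsymbol)
untyped = UntypedState(1, :randomsymbol)

accumulate!(typed)
accumulate!(untyped)

# Concrete field types are what let the compiler infer s.myint and s.mysym.
isconcretetype(fieldtype(TypedState, :myint))    # true
isconcretetype(fieldtype(UntypedState, :myint))  # false (the field is Any)
```

Running @code_warntype accumulate!(untyped) in the REPL should highlight the Any-typed field accesses, which is where the slowdown in the untyped-struct case comes from.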

Yeah, I think there are just a lot of confounding things going on in your benchmark which make this hard to interpret:

  • You’re doing @btime accessdirectly_nottyped(myint, mysym), which treats myint and mysym as global variables. All of the allocations come from that; you need $myint, $mysym to avoid this.
  • I think the compiler is outsmarting your benchmark code. Since s == :randomsymbol is always true or always false, I bet the compiler is moving that check outside the loop and optimizing away most of your code, which is why it returns a suspiciously low time of 1 ns.
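To make the two points above concrete, here is a small sketch of the three ways of calling @btime on the same function. It assumes BenchmarkTools is installed; timings are deliberately not shown since they vary by machine and Julia version.

```julia
using BenchmarkTools

function accessdirectly(i, s)
    res = 0
    for _ in 1:100
        if s == :randomsymbol
            res += 1
            i += res
        end
    end
    return res, i
end

myint = 1
mysym = :randomsymbol

# Non-constant globals: the benchmark also measures the cost of
# looking up untyped globals and dispatching dynamically.
@btime accessdirectly(myint, mysym)

# Interpolated: the values are spliced into the benchmark expression,
# so the global-variable overhead goes away.
@btime accessdirectly($myint, $mysym)

# Ref trick from the BenchmarkTools manual: the values are read from a
# Ref at run time, which makes it harder for the compiler to fold the
# whole call into a constant.
@btime accessdirectly($(Ref(myint))[], $(Ref(mysym))[])
```

Even with the Ref trick, the compiler is still free to simplify the loop itself, so a very small timing here does not by itself say anything about struct field access.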

As a general rule, there is no cost in accessing a (typed) field of a struct vs. accessing a local variable. Giving types to your function arguments has no effect on this one way or another. So please feel free to use structs to organize your data if it makes your life easier.
