Okay, a few things here:
First, your test with a literal 3 in it is broken: the compiler simply constant-propagates the value away where it can. Better make the number you assign random, or pass it in as an argument. Second, there's a technique called a function barrier. In short, you want to separate the parts of a function in which the types are known/inferrable and don't change from the rest. See the shielded_test function below. Ideally, you want to go one step further: extract the parameters you need from the struct once and feed them to an inner function, which will then be faster, because it doesn't need to extract fields from a tuple in each iteration. See proper_test in the code below. If you do that cleverly, storing mixed types in the struct is totally fine, as long as you help the compiler figure out what's going on.
using BenchmarkTools

mutable struct T1
    pm::NamedTuple{(:b,),Tuple{Int64}}  # This definition is fast, but it can only accept one parameter
    function T1()
        return new()  # pm is left uninitialized; the tests assign it before reading
    end
end

mutable struct T2
    pm::NamedTuple{names,T} where {names, T <: Tuple{Vararg{Int64}}}
    function T2()
        return new()
    end
end

mutable struct T3
    pm::NamedTuple{names,T} where {names, T <: Tuple}  # This can accept multiple parameters, but the speed is very slow, and there will be a lot of extra memory allocation
    function T3()
        return new()
    end
end
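
# Original test: the literal 3 lets the compiler constant-propagate the result.
function test(g)
    g.pm = (b=3,)
    cc = 0
    nt = g.pm
    for i = 1:10^8
        cc = cc + nt.b
    end
    return (g.pm, cc)
end

# Function barrier: the loop lives in an inner function that receives the tuple.
function shielded_test(g)
    g.pm = (b=rand(Int),)
    cc = 0
    nt = g.pm
    return (g.pm, _inner_shielded_test(nt, cc))
end

function _inner_shielded_test(nt, cc)
    for i = 1:10^8
        cc = cc + nt.b
    end
    return cc
end

# Extract the field once and pass only the value to the inner function.
function proper_test(g)
    g.pm = (b=rand(Int),)
    cc = 0
    nt = g.pm
    return (g.pm, _inner_proper_test(nt.b, cc))
end

function _inner_proper_test(b, cc)
    for i = 1:10^8
        cc = cc + b
    end
    return cc
end
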
println("Previous test:")
b1 = @benchmark test(g) setup = g = T1()
b2 = @benchmark test(g) setup = g = T2()
b3 = @benchmark test(g) setup = g = T3()
println("""
T1: Time: $(string(b1)), Allocs: $(b1.allocs)
T2: Time: $(string(b2)), Allocs: $(b2.allocs)
T3: Time: $(string(b3)), Allocs: $(b3.allocs)
""")
println("Shielded test:")
b1 = @benchmark shielded_test(g) setup = g = T1()
b2 = @benchmark shielded_test(g) setup = g = T2()
b3 = @benchmark shielded_test(g) setup = g = T3()
println("""
T1: Time: $(string(b1)), Allocs: $(b1.allocs)
T2: Time: $(string(b2)), Allocs: $(b2.allocs)
T3: Time: $(string(b3)), Allocs: $(b3.allocs)
""")
println("Proper test:")
b1 = @benchmark proper_test(g) setup = g = T1()
b2 = @benchmark proper_test(g) setup = g = T2()
b3 = @benchmark proper_test(g) setup = g = T3()
println("""
T1: Time: $(string(b1)), Allocs: $(b1.allocs)
T2: Time: $(string(b2)), Allocs: $(b2.allocs)
T3: Time: $(string(b3)), Allocs: $(b3.allocs)
""")
which gives:
Previous test:
T1: Time: Trial(1.299 ns), Allocs: 0
T2: Time: Trial(1.246 s), Allocs: 2
T3: Time: Trial(3.790 s), Allocs: 99999831
Shielded test:
T1: Time: Trial(9.300 ns), Allocs: 0
T2: Time: Trial(1.563 s), Allocs: 100000003
T3: Time: Trial(97.048 ns), Allocs: 3
Proper test:
T1: Time: Trial(9.201 ns), Allocs: 0
T2: Time: Trial(89.958 ns), Allocs: 4
T3: Time: Trial(114.332 ns), Allocs: 4
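If you want to see why the barrier helps without running a benchmark, you can ask the compiler what it infers. The following is only a minimal sketch: `Holder`, `unshielded`, `kernel` and `shielded` are hypothetical names made up for illustration (with `Holder` standing in for a struct like T3), and the exact inference results may differ between Julia versions.

```julia
# Hypothetical stand-in for T3: a mutable struct with an abstractly typed field.
mutable struct Holder
    pm::NamedTuple
end

# Reads the abstractly typed field directly, so the type of `h.pm.b`
# cannot be inferred from the argument type alone.
unshielded(h) = h.pm.b + 1

# Function barrier: `kernel` is compiled for the concrete type of `nt`.
kernel(nt) = nt.b + 1
shielded(h) = kernel(h.pm)

h = Holder((b = 1,))

# Inferred return types: not concrete without the barrier,
# concrete for the inner kernel once the tuple type is known.
rt_unshielded = first(Base.return_types(unshielded, (Holder,)))
rt_kernel = first(Base.return_types(kernel, (typeof(h.pm),)))

println("unshielded: ", rt_unshielded, " (concrete: ", isconcretetype(rt_unshielded), ")")
println("kernel:     ", rt_kernel, " (concrete: ", isconcretetype(rt_kernel), ")")
```

In the REPL, `@code_warntype unshielded(h)` shows the same information in more detail, with the non-concrete types highlighted.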
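Just for reference: T3 can then also store further fields of arbitrary types without hurting performance. In the calls below, the test functions take a second argument that is stored in the tuple as an extra field: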
# pass second argument to the tuple as a field "c"
julia> @benchmark shielded_test(g, "hi there") setup = g = T3()
BenchmarkTools.Trial:
memory estimate: 80 bytes
allocs estimate: 3
--------------
minimum time: 95.679 ns (0.00% GC)
median time: 101.474 ns (0.00% GC)
mean time: 109.132 ns (3.64% GC)
maximum time: 2.803 μs (94.33% GC)
--------------
samples: 10000
evals/sample: 949
julia> @benchmark proper_test(g, Complex(1,2)) setup = g = T3()
BenchmarkTools.Trial:
memory estimate: 112 bytes
allocs estimate: 4
--------------
minimum time: 111.435 ns (0.00% GC)
median time: 120.279 ns (0.00% GC)
mean time: 132.702 ns (6.05% GC)
maximum time: 4.053 μs (96.10% GC)
--------------
samples: 10000
evals/sample: 927
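The two-argument variants used in these last benchmarks are not part of the listing above. A rough sketch of what such a variant could look like follows. It is an assumption rather than the exact code that was benchmarked: `Store`, `shielded_sum` and `_inner_sum` are hypothetical names, the loop length is passed in as `n` to keep the example quick, and `rand(1:10)` replaces `rand(Int)` so the sum is easy to check.

```julia
# Hypothetical stand-in for T3: the field accepts any NamedTuple.
mutable struct Store
    pm::NamedTuple
    Store() = new()  # pm is assigned before it is first read
end

# Outer function: builds the tuple with an extra field `c` of arbitrary type,
# then hands it to the inner function (the function barrier).
function shielded_sum(g, c, n)
    g.pm = (b = rand(1:10), c = c)
    nt = g.pm
    return (nt, _inner_sum(nt, 0, n))
end

# Inner function: compiled for the concrete type of `nt`, so reading `nt.b`
# in the loop is type-stable regardless of the type of `c`.
function _inner_sum(nt, cc, n)
    for i in 1:n
        cc += nt.b
    end
    return cc
end

g = Store()
nt, total = shielded_sum(g, "hi there", 1000)
println(nt, " -> ", total)
```

The only dynamic dispatch happens once, at the call to `_inner_sum`, which is why the extra field does not slow down the loop itself.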