Need help with the performance of a struct with integer type parameters.
I have the following piece of code for an AD package I am developing.
PGrad is a struct for tracking gradients w.r.t. N_v variables in N_c cells.
PVar is a struct for variables.
using StaticArrays
using BenchmarkTools
struct PGrad{Nv, Nc}
    ind::SVector{Nc, Int}
    grad::SVector{Nc, SVector{Nv, Float64}}
end
struct PVar{T<:PGrad}
    val::Float64
    grad::T
end
Construction of one PVar{PGrad{2,1}} takes:
pg1 = PGrad{2,1}(SA[1], SA[SA[1.0, 0.0]])
@btime PVar{PGrad{2,1}}(1.0, pg1)
70.569 ns (1 allocation: 48 bytes)
If I explicitly write out the concrete types, without type parameters:
struct CGrad
    ind::SVector{1, Int}
    grad::SVector{1, SVector{2, Float64}}
end
struct CVar
    val::Float64
    grad::CGrad
end
Construction becomes much faster:
cg1 = CGrad(SA[1], SA[SA[1.0, 0.0]])
@btime CVar(1.0, cg1)
7.000 ns (0 allocations: 0 bytes)
If I annotate the type of the grad argument at the call site, the time for both cases drops below ~5 ns:
ug1 = PGrad{2,1}(SA[1], SA[SA[1.0, 0.0]])
@btime PVar{PGrad{2,1}}(1.0, ug1::PGrad{2,1})
4.800 ns (0 allocations: 0 bytes)
cg1 = CGrad(SA[1], SA[SA[1.0, 0.0]])
@btime CVar(1.0, cg1::CGrad)
4.500 ns (0 allocations: 0 bytes)
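For comparison, here is a minimal sketch of the same two benchmarks with the global variables interpolated using `$`, which the BenchmarkTools manual recommends so that access to untyped, non-const globals is not part of the measurement. It repeats the definitions above so that it runs on its own. Whether this accounts for the whole difference is an assumption to check, so no timings are shown.

```julia
using StaticArrays
using BenchmarkTools

# Parametric version, as defined above.
struct PGrad{Nv, Nc}
    ind::SVector{Nc, Int}
    grad::SVector{Nc, SVector{Nv, Float64}}
end
struct PVar{T<:PGrad}
    val::Float64
    grad::T
end

# Hand-written concrete version, as defined above.
struct CGrad
    ind::SVector{1, Int}
    grad::SVector{1, SVector{2, Float64}}
end
struct CVar
    val::Float64
    grad::CGrad
end

pg1 = PGrad{2,1}(SA[1], SA[SA[1.0, 0.0]])
cg1 = CGrad(SA[1], SA[SA[1.0, 0.0]])

# `$` splices the current value into the benchmarked expression, so the
# timed code works with a concretely typed value instead of looking up
# a non-const global on every evaluation.
@btime PVar{PGrad{2,1}}(1.0, $pg1)
@btime CVar(1.0, $cg1)
```

Declaring the globals as `const` is another way to give them a fixed type for the benchmark.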
Any idea what causes the difference in timing and in the number of allocations?
Maybe I am not using integer type parameters the correct way.