Hello.
I am (still) trying to make sense on how to efficiently traverse a vector containing static vectors and matrices and doing operations on them. This is in the context of molecular dynamics, where I have N particles, with vectors on them (positions, velocities), and matrices (e.g. inertia tensors).
So, to compute the norm of all vectors I have the naïve implementation (function r2) , which is greatly sped-up with the simple use of an inner variable r_i = r[i] (function r2_v2). Using a regular vector for this (function r2_v3) also speeds things up, but not as greatly. All this, due to fewer allocations, I guess.
174.579 μs (4490 allocations: 140.58 KiB)
589.067 ns (1 allocation: 7.94 KiB)
8.175 μs (2 allocations: 8.02 KiB)
But then, when I compute the operation r·J·r (which I need, long story…), these tricks do not work. The number of allocations does not go down. Why?
32.308 μs (1490 allocations: 31.20 KiB)
46.945 μs (1979 allocations: 38.84 KiB)
704.404 μs (6493 allocations: 328.33 KiB)
Here is the code. Thanks for any hints whatsoever.
using StaticArrays
using LinearAlgebra
using BenchmarkTools
N = 1000 ;
rr = Vector{SVector{2}}(undef,N);
for i in 1:N rr[i] = rand(2) end
function r2( r )
N = length(r)
rr = Vector{Float64}(undef,N)
for i in 1:N
rr[i] = dot( r[i] , r[i] )
end
return rr
end
@btime r2($rr) ;
function r2_v2( r )
N = length(r)
rr = Vector{Float64}(undef,N)
for i in 1:N
r_i = rr[i]
rr[i] = dot( r_i , r_i )
end
return rr
end
@btime r2_v2($rr) ;
function r2_v3( r )
N = length(r)
rr = Vector{Float64}(undef,N)
r_i = zeros(2)
for i in 1:N
r_i .= rr[i]
rr[i] = dot( r_i , r_i )
end
return rr
end
@btime r2_v3($rr) ;
JJ = Vector{SMatrix{2, 2}}(undef,N);
for i in 1:N JJ[i] = rand(2,2) end
function rJr( r , J )
N = length(r)
rjr = Vector{Float64}(undef,N)
for i in 1:N
rjr[i] = dot( r[i] , J[i], r[i] )
end
return rjr
end
@btime rJr($rr , $JJ) ;
function rJr_v2( r , J )
N = length(r)
rjr = Vector{Float64}(undef,N)
for i in 1:N
r_i = rr[i]
J_i = J[i]
rjr[i] = dot( r_i , J_i, r_i )
end
return rjr
end
@btime rJr_v2($rr , $JJ) ;
function rJr_v3( r , J )
N = length(r)
rjr = Vector{Float64}(undef,N)
r_i = zeros(2)
J_i = zeros(2,2)
Jr_i = zeros(2)
for i in 1:N
r_i .= rr[i]
J_i .= J[i]
Jr_i.= J_i * r_i
rjr[i] = dot( r_i , Jr_i )
end
return rjr
end
@btime rJr_v3($rr , $JJ) ;