Hi everyone,
I am looking for suggestions to speed up the following operation, which I need to frequently evaluate to construct a Jacobian matrix, and which accounts for approx. 80% of computation time for said Jacobian.
The expression I am evaluating is (S .* (S * (d .* S))) * X
where
- S is a n x n symmetric matrix
- d is a n x 1 column vector
- X is a n x p matrix
- n>>p, e.g. n = 100, p = 10
I tried a naive loop-implementation and one using LoopVectorization and Tullio, but both are substantially slower than the naive evaluation of my expression. Below is a code example:
using BenchmarkTools, Random, Tullio, LoopVectorization;
Random.seed!(1);
n = 100;
p = 10;
S = randn(n,n);
S = S'*S;
d = randn(n);
X = randn(n,p);
function naive_loop(S::Array{T,2},d::Vector{T},X::Array{T,2}) where T
out = similar(X)
@inbounds for i in 1:size(S,1)
@inbounds for l in 1:size(X,2)
out[i,l] = naive_inner_loop(S,d,X,i,l)
end
end
out
end
function naive_inner_loop(S::Array{T,2},d::Vector{T},X::Array{T,2},i::Int,l::Int) where T
r = zero(T)
@inbounds for j in 1:size(S,1)
@inbounds for k in 1:size(S,1)
r += S[i,j]*S[i,k]*S[j,k]*d[k]*X[j,l]
end
end
r
end
function tullio_loop(S::Array{T,2},d::Vector{T},X::Array{T,2}) where T
@tullio out[i,l] := @inbounds S[j,i]*S[k,i]*S[j,k]*d[k]*X[j,l]
end
@assert naive_loop(S,d,X) ≈ tullio_loop(S,d,X);
@assert tullio_loop(S,d,X) ≈ (S .* (S * (d .* S))) * X ;
@btime naive_loop($S,$d,$X) ;
# 10.543 ms (1 allocation: 7.94 KiB)
@btime tullio_loop($S,$d,$X) ;
# 1.306 ms (50 allocations: 10.56 KiB)
@btime ($S .* ($S * ($d .* $S))) * $X;
# 81.802 μs (7 allocations: 242.45 KiB)
Any help on how to speed up this operation is much appreciated.
Best regards,
Philipp