This is a cross-post from an issue I opened in the StaticArrays
package: https://github.com/JuliaArrays/StaticArrays.jl/issues/537
I have been working on a pull-request to fix an issue I discovered with StaticArrays
. I discovered that when one entity that happens to be a column vector (single column of matrix), and the other a row vector, and you compute the outer product of these two vectors, the returned array is not a StaticArray
but a regular allocated array. I stumbled upon this issue whilst implementing a particular cost function. I have stripped the cost function and given it as an example below.
I came up with a proposed fix by adding the following two definitions to matrix_multiply.jl
in the https://github.com/JuliaArrays/StaticArrays.jl package:
@inline *(A::StaticMatrix, B::Adjoint{<:Any, <:StaticVector}) = *(reshape(A, Size(Size(A)[1],)), B)
@inline mul!(dest::StaticVecOrMat, A::StaticMatrix, B::Adjoint{<:Any, <:StaticVector}) = mul!(dest, reshape(A, Size(Size(A)[1],)), B)
I have verified that adding these additional dispatch rules fixes the original problem (the returned array is now a StaticArray
). However, when I am summing the result of these StaticArray
types I am suddenly getting a lot of allocations.
You can reproduce the problem by running the following code snippet:
using StaticArrays, BenchmarkTools, LinearAlgebra
function hom(v::SVector)
push(v,1)
end
function T(๐::AbstractArray, ๐::Tuple{AbstractArray, Vararg{AbstractArray}}, ๐::Tuple{AbstractArray, Vararg{AbstractArray}})
โ = kron
l = 9
๐โ = SMatrix{9,9}(1.0I)
๐โ = SMatrix{1,1}(1.0I)
๐ = @SMatrix zeros(9,9)
N = length(๐[1])
โณ, โณสน = ๐
ฮโ, ฮโ = ๐
๐ฒโ = @MMatrix zeros(4,4)
๐โ = @SMatrix [1.0; 0.0; 0.0]
๐โ = @SMatrix [0.0; 1.0; 0.0]
for n = 1: N
index = SVector(1,2)
๐ฒโ[1:2,1:2] .= ฮโ[n][index,index]
๐ฒโ[3:4,3:4] .= ฮโ[n][index,index]
๐ฆ = hom(โณ[n])
๐ฆสน= hom(โณสน[n])
๐โ = (๐ฆ โ ๐ฆสน)
โโ๐ฎโ = [(๐โ โ ๐ฆสน) (๐โ โ ๐ฆสน) (๐ฆ โ ๐โ) (๐ฆ โ ๐โ)]
๐โ = โโ๐ฎโ * ๐ฒโ * โโ๐ฎโ'
๐บโ = ๐' * ๐โ * ๐
๐บโโปยน = inv(๐บโ)
๐โ = @SMatrix zeros(Float64,9,9)
for k = 1:l
๐โ = ๐โ[:,k]
โ๐โ๐บโ = (๐โ โ ๐โ') * ๐โ * (๐โ โ ๐) + (๐โ โ ๐') * ๐โ * (๐โ โ ๐โ)
# Accumulating the result in ๐โ allocates memory, even though
# the two terms in the summation are both SArrays.
๐โ = ๐โ + ๐โ * ๐บโโปยน * (โ๐โ๐บโ) * ๐บโโปยน * ๐โ' * ๐ * ๐โ'
end
๐ = ๐ + ๐โ
end
๐
end
# Some sample data
N = 300
โณ = [@SVector rand(2) for i = 1:N]
โณสน = [@SVector rand(2) for i = 1:N]
ฮโ = [SMatrix{3,3}(Matrix(Diagonal([1.0,1.0,0.0]))) for i = 1:length(โณ)]
ฮโ = [SMatrix{3,3}(Matrix(Diagonal([1.0,1.0,0.0]))) for i = 1:length(โณ)]
F = @SMatrix rand(3,3)
๐ = (ฮโ,ฮโ)
๐ = (โณ, โณสน)
T(vec(F),๐,๐)
@btime T(vec($F),$๐,$๐) # 682.152 ฮผs (6002 allocations: 3.85 MiB
I tried @code_warntype T(vec(F),๐,๐)
which produced an output too long to list here. However, a particular line jumped out with Any
:
1230 โ %5524 = invoke Base.afoldl(%5506::typeof(*), %5517::SArray{Tuple{9,1},Float64,2,9}, %5475::Adjoint{Float64,SArray{Tuple{9},Float64,1,9}}, _2::SArray{Tuple{9},Float64,1,9}, %5476::Adjoint{Float64,SArray{Tuple{9},Float64,1,9}})::Any
I donโt know how to proceed from here and would appreciate any advice. In particular, I am not sure whether my โfixโ is missing something, or whether I have stumbled upon a different โbugโ.