Hi, I have a setup where I need to repeatedly solve a linear system like A \ B = C every timestep of a system. A is a matrix, B is a vector and both changing each timestep and more importantly A is reused later each timestep for other functions like a matrix multiplication e.g. D = A * C.
What I assume is a naive function might look like:
using LinearAlgebra
using BenchmarkTools
A = rand(Float32,10,10)
A_copy = similar(A)
B = rand(Float32,10)
C = similar(B)
D = similar(B)
function lu_div_mul!(A::Array{Float32,2},A_copy::Array{Float32,2},B::Array{Float32,1},C::Array{Float32,1},D::Array{Float32,1})
copy!(A_copy,A)
LU = lu!(A_copy)
ldiv!(C,LU,B)
mul!(D,A,C)
end
@btime lu_div_mul!($A,$A_copy,$B,$C,$D)
843.077 ns (2 allocations: 144 bytes)
where the two allocations come from the lu factorisation. I would like to avoid these allocations and ideally the unneeded copy of A as in reality the dimensions of A can be rather large. Is there an existing function or method, something like lu!(A_lu,A), where the factorisation A_lu can be pre-allocated and reused, avoiding the matrix copy?