Something is suboptimal here, because the “regular inverse” is also LU based — it starts with the LU factorization, and then does a solve.
(For one thing, in the case where a and b overlap, you are computing the corresponding rows twice in your two rdiv! calls. Ideally you should take the union of a and b and then do a single rdiv! call.)
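To illustrate, here is a rough sketch (not your actual code — I'm assuming a and b are vectors of row indices of the inverse that you want, and inv_rows is just a made-up name):

using LinearAlgebra

function inv_rows(M, a, b)
    F = lu(M)                                # factor once
    rows = union(a, b)                       # shared indices get solved only once
    E = zeros(eltype(F), length(rows), size(M, 1))
    for (k, i) in enumerate(rows)
        E[k, i] = 1                          # row k of E is the unit row eᵢᵀ
    end
    return rdiv!(E, F), rows                 # rows of inv(M), in the order of `rows`
end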
For example, on my machine, this has a speed almost indistinguishable from inv(A) for A = rand(1000,1000):
using LinearAlgebra

function inv2(A)
    LU = lu(A)                               # factor once, as inv(A) does internally
    return ldiv!(LU, Matrix{eltype(LU)}(I, size(A)...))  # overwrite an identity with LU \ I
end
but using rdiv! is actually much slower than inv(A) for some reason (I initially misread the units and thought it was slightly faster):
function inv3(A)
    LU = lu(A)                               # same factorization as inv2
    return rdiv!(Matrix{eltype(LU)}(I, size(A)...), LU)  # overwrite an identity with I / LU
end
via:
julia> A = rand(1000,1000); @btime inv($A); @btime inv2($A); @btime inv3($A);
10.791 ms (5 allocations: 8.13 MiB)
10.796 ms (5 allocations: 15.27 MiB)
912.368 ms (5 allocations: 15.27 MiB)
julia> inv(A) ≈ inv2(A) ≈ inv3(A)
true
So it seems like there is a performance bug in the rdiv! implementation: rdiv! on LU object is much slower than ldiv! · Issue #55422 · JuliaLang/julia · GitHub
Workaround: for now, compute lu(I - X') and use ldiv!?
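Something like this rough sketch, assuming X is a real square matrix and rows holds the row indices of inv(I - X) that you need (inv_rows_workaround is just an illustrative name); it uses the identity E / (I - X) == ((I - X)' \ E')' to stay on the fast ldiv! path:

using LinearAlgebra

function inv_rows_workaround(X, rows)
    F = lu(I - X')                           # factor the transpose (real X assumed) once
    Et = zeros(eltype(F), size(X, 1), length(rows))
    for (k, i) in enumerate(rows)
        Et[i, k] = 1                         # column k of Et is the unit vector eᵢ
    end
    ldiv!(F, Et)                             # each column is now a row of inv(I - X)
    return permutedims(Et)                   # transpose back so rows line up with `rows`
end

The extra permutedims at the end should be cheap compared to the factorization and the triangular solves.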