I found a way without the need of a temporary storage:
function rdiv_lu!(A, B)
B_lu = lu!(B)
rdiv!(rdiv!(A, UpperTriangular(B_lu.factors)), UnitLowerTriangular(B_lu.factors))
_apply_inverse_ipiv_cols!(B_lu, A)
end
The function _apply_inverse_ipiv_cols!
is the equivilant of _apply_inverse_ipiv
but applied to columns instead of rows.
Benchmarks and correctnes:
julia> A,B = 1:2 .|>_-> rand(100, 100);
julia> A / B ≈ rdiv_lu!(A, B)
true
julia> @benchmark A / B setup = A,B = copy($A),copy($B)
BenchmarkTools.Trial:
memory estimate: 235.16 KiB
allocs estimate: 10
--------------
minimum time: 279.562 μs (0.00% GC)
median time: 284.536 μs (0.00% GC)
mean time: 316.165 μs (1.47% GC)
maximum time: 19.354 ms (0.00% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark rdiv_lu!(A, B) setup = A,B = copy($A),copy($B)
BenchmarkTools.Trial:
memory estimate: 528 bytes
allocs estimate: 2
--------------
minimum time: 243.370 μs (0.00% GC)
median time: 262.753 μs (0.00% GC)
mean time: 279.792 μs (0.00% GC)
maximum time: 14.217 ms (0.00% GC)
--------------
samples: 10000
evals/sample: 1
I implemented a rdiv!
function which accepts a LU factorization object and made a pull request: Let rdiv! accept a LU factorization object by zsoerenm · Pull Request #31285 · JuliaLang/julia · GitHub You will find the definition of _apply_inverse_ipiv_cols!
there.