Which algorithm does Julia use for QR decomposition?
Additionally, parallelizing decomposition methods is non-trivial. Does it use a specialized implementation on the GPU?
Julia uses multiple dispatch, so `qr(A)` depends on the type of `A`. For generic dense matrices, it uses Householder QR (via LAPACK’s `*geqrf`), and there is a `pivot` argument to use a column-pivoted variant (via LAPACK’s `*geqp3`).
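A minimal sketch of what that looks like at the REPL (the matrix here is just a random example):

```julia
using LinearAlgebra

A = rand(100, 100)

# Unpivoted Householder QR (LAPACK geqrf under the hood for dense Float64).
F = qr(A)
F.Q * F.R ≈ A              # true; Q is stored implicitly as Householder reflectors

# Column-pivoted variant (LAPACK geqp3).
Fp = qr(A, ColumnNorm())
Fp.Q * Fp.R ≈ A[:, Fp.p]   # true; Fp.p is the column permutation
```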
The standard library can use CPU threads, but it does nothing with the GPU; for a GPU you use a GPUArrays.jl-backed package (such as CUDA.jl) or similar. You can also use distributed memory with Elemental.jl, and so on. I don’t know offhand whether those packages include parallel QR functions.
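For example, on an NVIDIA GPU, CUDA.jl provides a `qr` method for CuArrays that dispatches to cuSOLVER. A hedged sketch, assuming CUDA.jl is installed, a GPU is available, and the GPU factorization exposes the same `R` property as the CPU one:

```julia
using CUDA, LinearAlgebra

# A matrix that lives on the GPU; qr dispatches to CUDA.jl's method,
# which calls into cuSOLVER rather than LAPACK.
A = CUDA.rand(Float32, 256, 256)
F = qr(A)        # factorization object whose factors stay in GPU memory
R = F.R          # upper-triangular factor, still a GPU array
```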
> Julia uses multiple dispatch, so `qr(A)` depends on the type of `A`.
Could you point me to the source code where it solves the specific cases? Also, does the above cover all cases, do you think?
On macOS, is it using Apple’s provided LAPACK, or does it ship its own? Apple’s own LAPACK is accessed via the Accelerate framework, I believe, located at: /Accelerate.framework/Frameworks/vecLib.framework/Headers/lapack.h
You can find the source by running `@edit qr(rand(100, 100))`. In this particular case, you’ll see that it’s actually calling `qr!` (the in-place version), but it’s in the same file.
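If you just want the selected method and its file/line without opening an editor, `@which` (from InteractiveUtils, loaded by default at the REPL) will print it; a small sketch:

```julia
using LinearAlgebra, InteractiveUtils

A = rand(100, 100)

@which qr(A)   # prints the method that dispatch selects, with source file and line
@edit  qr(A)   # opens that location in your configured editor
```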
If the call stack becomes more complicated, you can pretty easily “descend” down the dispatch stack using Cthulhu.jl:

```julia
using Cthulhu, LinearAlgebra
@descend qr(rand(100, 100))
```
When doing the same for a CUDA array, `@descend` will eventually take you to a cuSOLVER call (cuSOLVER is Nvidia’s library of dense and sparse solver routines, the LAPACK counterpart to cuBLAS). You can find what algorithm they use in the cuSOLVER documentation.
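The same inspection workflow, sketched for a GPU array (assuming CUDA.jl and a working GPU):

```julia
using CUDA, Cthulhu, LinearAlgebra

# Descending from qr on a CuArray should eventually land in CUDA.jl's
# CUSOLVER wrappers (geqrf) rather than in LAPACK.
@descend qr(CUDA.rand(Float32, 256, 256))
```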
It defaults to using the LAPACK from OpenBLAS, which is bundled with Julia. It can optionally use Apple Accelerate via the AppleAccelerate.jl package (or Intel MKL via the MKL.jl package): you just type `using AppleAccelerate`, and the same calls like `qr(...)` will be transparently redirected to Accelerate’s implementation (if one exists), thanks to a magical bit of infrastructure called libblastrampoline.
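You can check which backend libblastrampoline is currently forwarding to with `BLAS.get_config()`; a quick sketch (the AppleAccelerate step assumes you are on macOS and have that package installed):

```julia
using LinearAlgebra
BLAS.get_config()       # lists the loaded BLAS/LAPACK libraries (OpenBLAS by default)

using AppleAccelerate   # swaps the backend via libblastrampoline
BLAS.get_config()       # should now list Accelerate
```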
Yes. (There is also the LQ factorization.) For a matrix that might be rank-deficient, using the pivoted variant `qr(A, ColumnNorm())` is usually a good idea.
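As an illustration of why pivoting helps, here is a small sketch with a deliberately rank-deficient matrix; with pivoting, the magnitudes along the diagonal of `R` give a usable estimate of the numerical rank:

```julia
using LinearAlgebra

# Rank-2 matrix built from two outer products (5x5, so it is rank deficient).
A = rand(5, 2) * rand(2, 5)

F = qr(A, ColumnNorm())        # column-pivoted Householder QR
d = abs.(diag(F.R))            # pivoting pushes the negligible values to the end
numerical_rank = count(>(1e-10 * d[1]), d)   # should be 2
```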