Use QR factorization efficiently


#1

Hey there,

I wanted to solve an overdetermined system Ax=b where A \in \mathbb{R}^{m \times n} and b \in \mathbb{R}^{m} with m < n using the QR decomposition. Since this in a very performance critical part I wanted to do this as memory efficient as possible. Does anybody has any experience how to do it?


#2

qrfact!? Then A_ldiv_B!?


#3

I figured it out, but it is slightly confusing.

A = rand(Complex128, 45, 4)
b = rand(Complex128, 45)
c = rand(Complex128, 4)
d = A \ b # d is a 4-element vector

QR = qrfact!(A, b)
A_ldiv_B!(c, QR, b) # index out of bounds errors
A_ldiv_B!(QR, b) # works, the solutions is in the first 4 entries

I wonder though, why the A_ldiv_B!(c, QR, b) does not work.


#4

Looks like https://github.com/JuliaLang/julia/issues/22560? Not sure if it has been fixed in 0.7 or not.


#5

First alternative,

x = A \ b # Uses the QR method

Second alternative

Q, R = qr(A) # Compute the QR factorization to use the reduce form
x = inv(factorize(R)) * Q.'b # Notice that R is upper triangular

Those are the two ways you can use Base.LinAlg.QRCompactWY.

A more efficient method in general is Cholesky decomposition. A more precise method usually is singular values decomposition. Here are a few options which compute the information matrix as well (critical component for variance covariance estimates).

function solve(linearpredictor::AbstractMatrix, response::AbstractVector,
               method::Type{Base.LinAlg.QRCompactWY})
    Q, R = qr(linearpredictor)
    β = inv(R) * Q.'response
    IM = inv(factorize(R)) * inv(factorize(R.'))
    β, IM
end
# Similar to using x = pinv(A) * b
function solve(linearpredictor::AbstractMatrix, response::AbstractVector, 
               method::Type{Base.LinAlg.SVD}) 
    SVD = svdfact(linearpredictor)
    β = SVD[:V] * diagm(1 ./ SVD[:S]) * SVD[:U].'response
    IM = SVD[:V] * diagm(1 ./ SVD[:S].^2) * SVD[:Vt]
    β, IM
end
function solve(linearpredictor::AbstractMatrix, response::AbstractVector, 
               method::Type{Base.LinAlg.Cholesky})
    IM = inv(cholfact!(linearpredictor.'linearpredictor))
    β = IM * linearpredictor.'response
    β, IM
end
function solve(linearpredictor::AbstractMatrix, response::AbstractVector, 
               method::Type{Base.LinAlg.BunchKaufman})
    IM = inv(bkfact(linearpredictor.'linearpredictor))
    β = IM * linearpredictor.'response
    β, IM
end

As bonus, I am gonna leave here a quick way to make the system full rank by dropping linearly dependently predictors efficiently for overdetermined systems (more efficient than looking at non zero elements of the R in the QR decomposition).

function linearindependent(obj::AbstractMatrix{T}) where T <: Real
    cf = cholfact!(Symmetric(obj.'obj, :U), Val{true}, tol = -one(eltype(obj)))
    r = cf.rank
    p = size(obj, 2)
    if r < p
        LI = sort!(cf.piv[1:r])
        obj = obj[:, LI]
    else
        LI = eachindex(cf.piv)
    end
    return obj, LI
end

#6

I doubt any method involving qr, inv or multiplying the transpose of a matrix by the original matrix in v0.6 can beat a simple inplace qrfact! and inplace solve with A_ldiv_B! like Chris’s solution. A benchmarking is probably due if you want to claim these are more efficient.


#7

Not really. It requires that the matrix is positive definite, so “in general” doesn’t apply. In the case where your matrix is positive definite? Yes it’s more efficient.


#8

LDL’ drops this requirement for almost no additional cost.


#9

The question is specific for overdetermined system Ax=b for which the normal matrix will always be positive definite if A is full rank (hence the linearindependent function, but most likely the A would be full rank in those applications).


#10

The added benefit of using the “long-hand” QR factorization is for obtaining the information matrix as well (as opposed to x = A \ b). If one only needs the parameter estimates and nothing else, a more efficient method might be an iterative solver such as LSMR / LSQR both available in Julia 0.6. As far as I recall in benchmarking the qrfact! and A_ldivB!, those don’t add any or very little efficiency (both call BLAS routines if memory serves right). The in-place does have an advantage if one does iterative solutions that keep overwriting the vector (IRLS routines for example).


#11

I see, I thought you were saying to apply it directly.


#12

If A does not have a full column rank, then the normal matrix will not be positive definite. Also, for a dense matrix A, finding the normal matrix itself costs you more than the whole QR factorization! After QR factorization, all that’s left is a matrix vector multiplication costing O(mn) assuming A is m \times n, and a back-substitution costing O(n^2). So unless m >> n, working in terms of the normal matrix is not likely to give any benefits, because even a Cholesky factorization costs around n^3/3 flops, let alone the inverse.

For big sparse problems, I agree. For small dense problems, I doubt it. I think the “best” way really depends on the details of the problem, how big are m and n, is A dense or sparse, etc. there is no one cure for all as far as I know.


#13

That is correct. Cholesky is more efficient when m >> n which is the case of overdetermined linear systems and when efficiency matters (mn^{2} >> n^{3} \iff m >> n). I am assuming the efficiency gains are important (and will only be noticeable) for large problems which require m to be orders of magnitude bigger than n.

In practice, n increases considerably in sparse problems such as fixed effects (for which I would recommend the within transformation rather than actually estimating the problem directly for efficiency). Even naive implementations of the method of alternating projections will outperform estimations of the full model, but iterative solvers for sparse problems can be an option. For large m and small n Cholesky should work well.

In summary, SVD and QR complexity is tied to m, but Cholesky is mostly influenced by n alone. If m is much bigger than n and you need / care / want efficiency go Cholesky. Otherwise, QR is good, but for really ill-posed problems SVD might be the only option of the three.