How to do in-place least square `\`?

Suppose I want to solve the least square problem, I can just do x = A \ b where A is a ‘tall’ matrix. Is there an in-place version for this? If I’m not mistaken it just performs a QR factorization behind the scene to solve for x, but I’m not too sure how to call those functions optimally to have the least memory allocation. I don’t mind changing the entries of A, and I can supply some cache matrices to reduce allocation.

you probably should be using LinearSolve.jl for this.

Among the iterative solver backends offered by LinearSolve.jl, I know Krylov.jl has an in-place version, not sure about the others

1 Like

The typical size for A in my case is only size(A)=(128,4). I’m not sure if this is considered ‘large’ enough to use iterative solvers. I just need to solve A \ b many times as part of an optimization problem where each time the entries of A change, and right now I’m getting something like 98% time spent on gc, so I suspect using some sort of in-place version of A \ b might solve the memory allocation problem.

Have you tried StaticArrays? I think this may be small enough. I don’t know whether they implement QR decomposition, but try it if you have not yet.

Unfortunately, solving A \ b for many different A mostly requires refactoring A every time, which is the computationally expensive part. The exception is when the factorization can be efficiently updated, for example low-rank updates to a Cholesky factorization.

But you can still save some memory allocations. Since you say GC time is significant, the savings can be significant too. You want to use some in-place methods like the following:

using LinearAlgebra, Random

# initialize
b = Vector{Float64}(undef,128) # initialize memory for b
A = Matrix{Float64}(undef,(4,128)) # initialize memory for A

# loop
for <something>
  # compute input vector and matrix
  randn!(b) # overwrite b with randn values
  randn!(A) # overwrite A with randn values
  # solve A \ b
  Aqr = qr!(A, ColumnNorm()) # factorize A, with permission to overwrite the memory of A to reduce memory allocations
  ldiv!(Aqr, b) # overwrite b with A \ b

The ! at the end of these function names signifies that they are permitted/intended to mangle one or more of the inputs.

Replace my randn! calls with your actual computations of A and b. Ideally, compute these in recycled memory to avoid additional allocations. Look into other LinearAlgebra mutating functions (like mul!) to help with that part, and also in-place broadcasting with .= assignments.

A \ b does the same steps of factorizing A and then solving the system with the factorization. But here we take some manual control so that we allocate less in the process. The qr! allows us to mangle the input values in A so we allocate less memory for the factorization. Do not expect A to hold its original values afterward. Changing A will change Aqr, making it incorrect, so don’t change A until you’re done using Aqr. The ldiv! allows us to overwrite b with the result to save an additional memory allocation. b will have its values changed, but Aqr can still be used for additional solves if you have more vectors to solve against.