# How to do in-place least square `\`?

Suppose I want to solve the least square problem, I can just do `x = A \ b` where `A` is a ‘tall’ matrix. Is there an in-place version for this? If I’m not mistaken it just performs a QR factorization behind the scene to solve for `x`, but I’m not too sure how to call those functions optimally to have the least memory allocation. I don’t mind changing the entries of `A`, and I can supply some cache matrices to reduce allocation.

you probably should be using LinearSolve.jl for this.

Among the iterative solver backends offered by LinearSolve.jl, I know Krylov.jl has an in-place version, not sure about the others

1 Like

The typical size for `A` in my case is only `size(A)=(128,4)`. I’m not sure if this is considered ‘large’ enough to use iterative solvers. I just need to solve `A \ b` many times as part of an optimization problem where each time the entries of `A` change, and right now I’m getting something like 98% time spent on gc, so I suspect using some sort of in-place version of `A \ b` might solve the memory allocation problem.

Have you tried StaticArrays? I think this may be small enough. I don’t know whether they implement QR decomposition, but try it if you have not yet.

Unfortunately, solving `A \ b` for many different `A` mostly requires refactoring `A` every time, which is the computationally expensive part. The exception is when the factorization can be efficiently updated, for example low-rank updates to a Cholesky factorization.

But you can still save some memory allocations. Since you say GC time is significant, the savings can be significant too. You want to use some in-place methods like the following:

``````using LinearAlgebra, Random

# initialize
b = Vector{Float64}(undef,128) # initialize memory for b
A = Matrix{Float64}(undef,(4,128)) # initialize memory for A

# loop
for <something>
# compute input vector and matrix
randn!(b) # overwrite b with randn values
randn!(A) # overwrite A with randn values

# solve A \ b
Aqr = qr!(A, ColumnNorm()) # factorize A, with permission to overwrite the memory of A to reduce memory allocations
ldiv!(Aqr, b) # overwrite b with A \ b
end
``````

The `!` at the end of these function names signifies that they are permitted/intended to mangle one or more of the inputs.

Replace my `randn!` calls with your actual computations of `A` and `b`. Ideally, compute these in recycled memory to avoid additional allocations. Look into other `LinearAlgebra` mutating functions (like `mul!`) to help with that part, and also in-place broadcasting with `.=` assignments.

`A \ b` does the same steps of factorizing `A` and then solving the system with the factorization. But here we take some manual control so that we allocate less in the process. The `qr!` allows us to mangle the input values in `A` so we allocate less memory for the factorization. Do not expect `A` to hold its original values afterward. Changing `A` will change `Aqr`, making it incorrect, so don’t change `A` until you’re done using `Aqr`. The `ldiv!` allows us to overwrite `b` with the result to save an additional memory allocation. `b` will have its values changed, but `Aqr` can still be used for additional solves if you have more vectors to solve against.

2 Likes