Suppose I want to solve the least square problem, I can just do `x = A \ b`

where `A`

is a ‘tall’ matrix. Is there an in-place version for this? If I’m not mistaken it just performs a QR factorization behind the scene to solve for `x`

, but I’m not too sure how to call those functions optimally to have the least memory allocation. I don’t mind changing the entries of `A`

, and I can supply some cache matrices to reduce allocation.

you probably should be using LinearSolve.jl for this.

Among the iterative solver backends offered by LinearSolve.jl, I know Krylov.jl has an in-place version, not sure about the others

The typical size for `A`

in my case is only `size(A)=(128,4)`

. I’m not sure if this is considered ‘large’ enough to use iterative solvers. I just need to solve `A \ b`

many times as part of an optimization problem where each time the entries of `A`

change, and right now I’m getting something like 98% time spent on gc, so I suspect using some sort of in-place version of `A \ b`

might solve the memory allocation problem.

Have you tried StaticArrays? I think this may be small enough. I don’t know whether they implement QR decomposition, but try it if you have not yet.

Unfortunately, solving `A \ b`

for many different `A`

mostly requires refactoring `A`

every time, which is the computationally expensive part. The exception is when the factorization can be efficiently updated, for example low-rank updates to a Cholesky factorization.

But you can still save some memory allocations. Since you say GC time is significant, the savings can be significant too. You want to use some in-place methods like the following:

```
using LinearAlgebra, Random
# initialize
b = Vector{Float64}(undef,128) # initialize memory for b
A = Matrix{Float64}(undef,(4,128)) # initialize memory for A
# loop
for <something>
# compute input vector and matrix
randn!(b) # overwrite b with randn values
randn!(A) # overwrite A with randn values
# solve A \ b
Aqr = qr!(A, ColumnNorm()) # factorize A, with permission to overwrite the memory of A to reduce memory allocations
ldiv!(Aqr, b) # overwrite b with A \ b
end
```

The `!`

at the end of these function names signifies that they are permitted/intended to mangle one or more of the inputs.

Replace my `randn!`

calls with your actual computations of `A`

and `b`

. Ideally, compute these in recycled memory to avoid additional allocations. Look into other `LinearAlgebra`

mutating functions (like `mul!`

) to help with that part, and also in-place broadcasting with `.=`

assignments.

`A \ b`

does the same steps of factorizing `A`

and then solving the system with the factorization. But here we take some manual control so that we allocate less in the process. The `qr!`

allows us to mangle the input values in `A`

so we allocate less memory for the factorization. *Do not expect A to hold its original values afterward. Changing A will change Aqr, making it incorrect, so don’t change A until you’re done using Aqr.* The

`ldiv!`

allows us to overwrite `b`

with the result to save an additional memory allocation. `b`

will have its values changed, but `Aqr`

can still be used for additional solves if you have more vectors to solve against.