Unless you have a connection between each update, as @Oscar_Smith suggested, the only thing you can optimize is the actual solution.
1. Use the fastest solver. If you are on x64 and the arrays are dense, try using MKL.jl. On Apple Silicon, AppleAccelerate.jl can work wonders as well.
2. Choose the optimal decomposition for your matrix by doing the decomposition steps manually. For inspiration, have a look at MATLAB's `mldivide()`.
3. Reduce the allocations of the decomposition and solution by using FastLapackInterface.jl. It won't save much compared to the actual factorization and solve, but it is a starting point.
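A minimal sketch of the first item. The `using MKL` / `using AppleAccelerate` lines assume those packages are installed; the matrix sizes are only illustrative:

```julia
using LinearAlgebra
# using MKL              # on x64, uncomment to swap OpenBLAS for MKL (assumes MKL.jl is installed)
# using AppleAccelerate  # on Apple Silicon, uncomment to use Accelerate (assumes the package is installed)

A = randn(10_000, 100)   # tall, dense least-squares system
b = randn(10_000)

F = qr(A)                # thin QR, the usual choice for tall, well-conditioned problems
x = F \ b                # least-squares solution via the stored factorization
```

Loading MKL.jl or AppleAccelerate.jl simply swaps the BLAS/LAPACK backend, so the rest of the code stays unchanged.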
Even if the structure of \boldsymbol{A} cannot be exploited between iterations for an iterative decomposition, if the solution of the previous iteration is close to the next one, you may consider an iterative method with a warm start.
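To illustrate the warm-start idea, here is a hand-rolled conjugate-gradient solver on the normal equations that takes the previous solution as its starting guess. This is a sketch only (the function name `warm_cg!` is made up); in practice you would reach for a package such as IterativeSolvers.jl or Krylov.jl:

```julia
using LinearAlgebra

# Minimal CG on the normal equations A'A x = A'b. The incoming `x` is
# the starting guess, so a solution from the previous iteration gives a
# warm start. Illustrative sketch, not a production solver.
function warm_cg!(x, A, b; tol = 1e-10, maxiter = 200)
    r = A' * (b - A * x)          # residual of the normal equations
    p = copy(r)
    rs = dot(r, r)
    for _ in 1:maxiter
        Ap = A' * (A * p)
        α = rs / dot(p, Ap)
        x .+= α .* p
        r .-= α .* Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        p .= r .+ (rs_new / rs) .* p
        rs = rs_new
    end
    return x
end

A = randn(1_000, 50)
b = randn(1_000)
x = zeros(50)
warm_cg!(x, A, b)                 # cold solve
b2 = b .+ 0.01 .* randn(1_000)    # slightly perturbed right-hand side
warm_cg!(x, A, b2)                # warm start: x is already close, so few iterations are needed
```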
Another idea, in case your system is guaranteed to have a good condition number, is to take advantage of the small number of columns relative to the number of rows and solve the normal equations: \boldsymbol{C} \boldsymbol{x} = \boldsymbol{d}, where \boldsymbol{C} = \boldsymbol{A}^{T} \boldsymbol{A} and \boldsymbol{d} = \boldsymbol{A}^{T} \boldsymbol{b}.
Now the system is square, with only 100 rows (\boldsymbol{C} is 100 \times 100).
You will be able to use a Cholesky decomposition (pivoted, with the positive-definiteness check disabled; see Cholesky Decomposition of Low Rank Positive Semidefinite Matrix), since \boldsymbol{C} is at least symmetric positive semi-definite, and it will be about as fast as you can get. Still apply items 1 and 3 from above (item 2 is covered by Cholesky).
Beware that forming the normal equations squares the condition number, so use this approach wisely.
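The normal-equations route might look like the following sketch (the `RowMaximum()` spelling of the pivoted Cholesky assumes Julia ≥ 1.8; sizes are illustrative):

```julia
using LinearAlgebra

A = randn(10_000, 100)
b = randn(10_000)

C = Symmetric(A' * A)    # 100 × 100; SPD for full-rank A, SPSD in general
d = A' * b

# Pivoted Cholesky with the check disabled, so a rank-deficient C does
# not throw. For a guaranteed-SPD C, plain cholesky(C) also works.
F = cholesky(C, RowMaximum(); check = false)
x = F \ d
```

Building \boldsymbol{C} and \boldsymbol{d} costs one tall matrix product each, after which every factorization and solve touches only a 100 \times 100 system.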