UPDATE WITH MORE COMPLETE EXAMPLE
The advice already provided has been extremely helpful, and I am adding this update to provide greater detail and a more concrete example in case it spurs any other thoughts/advice. The following should provide a bit more clarity as to the specific sparse, banded structure of the system.
I hope to solve a large number of sparse, banded, linear system of the form
M=I - B .*P,
P is a sparse stochastic (i.e. transition) matrix, and
B is a discount vector that is a vector of numbers close to 1 (between .999 and 1). The code below creates an example of such a matrix
M that is ordered in a way to make the bandwidth small. The sparsity structure of
M never changes between needing to solve
M*x=v, but its values do. It occurred to me that the solver being called is likely spending a good bit of time determining the sparsity structure and appropriate algorithm for the system. Since the structure of the system never changes, I’m hoping that I can improve the speed of my solves by doing this part of the work of solving the system only once.
The code below is a bit long since it needs to accurately generate the sparsity structure of
M, however one can simply cut and paste without needing to understand or worry about the lengthy function
For the curious, the code is intended to solve for the present discounted values of future cash flows in an inventory management problem. An inventory manager has a given inventory capacity that can be filled by three different types of products. Products arrive and leave according to rates that depend on the current inventory level. To see that the system solves for the present discounted value of cash flow, observe that if
v is a vector representing the cash flow received at any inventory level that
x=v + B.*P*v .+ (B.*P)^2*v .+ (B.*P)^3*v .+ ..., the present-discounted cash flow.
using SparseArrays, LinearAlgebra, BenchmarkTools const Φ=3 function get_M(capacity) function add_index!(w,ϕ,ww_test,ww_fill,Vstates,plus_inds,minus_inds) w[ϕ]+=1 for ww=(ww_test+1):(ww_fill-1) if Vstates[ww]==w w[ϕ]-=1 plus_inds[ww_test,ϕ] = ww minus_inds[ww,ϕ] = ww_test return ww_fill end end Vstates[ww_fill].= w w[ϕ]-=1 plus_inds[ww_test,ϕ] = ww_fill minus_inds[ww_fill,ϕ] = ww_test return ww_fill+1 end function get_state_order(capacity::Int64) num_states = binomial(capacity+Φ,Φ) Vstates=[zeros(Int64,Φ) for bb=1:num_states] plus_inds = zeros(Int64,num_states,Φ) minus_inds = zeros(Int64,num_states,Φ) ww_test=1 ww_fill=2 while ww_test<=num_states w=Vstates[ww_test] if (sum(w)<capacity) for ϕ=1:Φ ww_fill=add_index!(w,ϕ,ww_test,ww_fill,Vstates,plus_inds,minus_inds) end end ww_test +=1 end return Vstates, plus_inds, minus_inds end function makeM(plus_inds,minus_inds,B,P_plus,P_minus) N_w = size(plus_inds,1) I_A = repeat(collect(1:N_w),Φ) J_A = vec(plus_inds) V_A = vec(B .* P_plus) I_D = repeat(collect(1:N_w),Φ) J_D = vec(minus_inds) V_D = vec(B .* P_minus) Is = vcat(I_A,I_D) Js = vcat(J_A,J_D) Vs = vcat(V_A,V_D) has_J = (Js .!= 0) Is=Is[has_J] Js=Js[has_J] Vs=Vs[has_J] #Actual premult matrix. BxP = sparse(Is,Js,Vs) M = sparse(I,N_w,N_w) .- (BxP) return M end #Get the states and the adjacency map. Vstates, plus_inds, minus_inds = get_state_order(capacity) #Create transition probabilities. P_plus = rand(Float64,size(plus_inds)) P_minus = rand(Float64,size(plus_inds)) #Ensure that transition probabilities are zero when at capacity or empty. P_plus[plus_inds.==0] .= 0 P_minus[minus_inds.==0] .= 0 #Normalize transition probabilities to sum to 1. P_norm = sum(P_plus,dims=2) .+ sum(P_minus,dims=2) P_plus = P_plus ./ (sum(P_plus,dims=2) .+ sum(P_minus,dims=2)) discount_rate = .999 .+ rand(Float64,size(P_plus,1)) ./1000. return makeM(plus_inds,minus_inds,discount_rate,P_plus,P_minus) end capacity = 40 @time M = get_M(capacity) v = randn(size(M,1)) @btime $M\$v
ORIGINAL POST BELOW
I have a large, very sparse banded matrix M. I need to compute
M\v many times with slight changes to the values (but not the sparsity structure) of
M. It occurs to me that much of the time spent in
M\v or other solvers is spent determining and exploiting the sparsity structure. I wonder whether there is a way not to repeat that work. In my case, the code below produces times that are similar for “solve once” and “solve twice simultaneously” but twice as long for “solve twice.” This suggests to me that if there were a way to do the first part of the solver’s work in exploiting the sparsity structure just once that it would improve my computation time by an order of magnitude or more. Does anyone know whether there is a way to do this (without writing my own linear systems solver in C,C++,or Fortran)? Thank you!
#Time 1:Solve once. v1 = rand(size(M,1)) @btime $M\$v1 #Time 2:Solve twice sequentially. function tf(M,v) out = M\v out .= M\v return out end @btime tf($M,$v1) #Time 3: Solve twice simultaneously. v2 = rand(size(M,1),2) @btime $M\$v2