# Best way to repeatedly multiply a Vector by a SparseMatrix

Hi, in my code I need to evaluate A^n * v where A is a possibly very large sparse matrix and v is a vector.

I suppose it is much faster to perform `A * (A * ... * (A * v)` rather than elevating A to the n-th power, but what is the most efficient way to do so?

I currently have

``````function A_pow_n_B!(res, A, n::Integer, v)
tmp = similar(v)
mul!(res, A, v)
for i in 1:n-1
mul!(tmp, A, res)
copyto!(res, tmp)
end
end
``````

Can this code be further improved?

I guess so, as the `copyto!` isn’t necessary in every iteration. You could alternate between `mul!(tmp, A, res)` and `mul!(res, A, tmp)` and return `tmp` or `res` in the end depending on which holds the result.

2 Likes

What you say makes a lot of sense.
But I tried to do as you suggested but the time difference seems negligible (most of the time is spent doing `mul!`: in my use case n \sim 10)

``````function new_A_pow_n_B!(res, x, n::Integer, y)
tmp = similar(y)
if isodd(n)
mul!(res, x, y)
for i in 2:2:n
mul!(tmp, x, res)
mul!(res, x, tmp)
end
else
mul!(tmp, x, y)
for i in 2:2:n-1
mul!(res, x, tmp)
mul!(tmp, x, res)
end
mul!(res, x, tmp)
end
end
``````

I get

``````d = 5000
r = zeros(d)

@benchmark new_A_pow_n_B!(\$(r), \$(sprand(ComplexF64, d, d, .01)), 9, \$(rand(ComplexF64, d)))
BenchmarkTools.Trial:
memory estimate:  78.20 KiB
allocs estimate:  2
--------------
minimum time:     3.181 ms (0.00% GC)
median time:      3.551 ms (0.00% GC)
mean time:        3.696 ms (0.05% GC)
maximum time:     7.348 ms (0.00% GC)
--------------
samples:          1349
evals/sample:     1

@benchmark A_pow_n_B!(\$(r), \$(sprand(ComplexF64, d, d, .01)), 9, \$(rand(ComplexF64, d)))
BenchmarkTools.Trial:
memory estimate:  78.20 KiB
allocs estimate:  2
--------------
minimum time:     3.254 ms (0.00% GC)
median time:      3.716 ms (0.00% GC)
mean time:        3.947 ms (1.17% GC)
maximum time:     53.210 ms (87.77% GC)
--------------
samples:          1264
evals/sample:     1
``````

FWIW, the gain should depend on `d` though. So for very large `d` Singular value decomposition can be used to calculate A^n in one step. Depending on the size of A and n it may be cheaper.

1 Like

… and have smaller error, also depending on `A` and `n`.

1 Like

There is no requirement on the size, but let’s say it can vary from 100 to 10^6, while n \lesssim 10. The matrix is sparse. Do you think it is worth to consider SVD?

What people call “sparse” is very subjective, so it depends. But perhaps for elements mostly around the diagonal and no tricky conditioning/cancellation issues, I would just go for the algorithm you first posted; as SVD may be unnecessarily complex. There are partial algorithms like

but this may be overkill.

The thing is that I can’t assume anything about the structure of the matrix, or it’s density. But, sticking to the repeated multiplication in the OP, are there other optimizations possible with Julia?

I think your original function is fine, I would just make sure that `v` is not sparse to begin with if I don’t expect sparsity to be preserved. But if you literally can’t assume anything, then it is hard to say more.

You can also optimize multiplication (since that’s the only thing that matters). Transposing `A` to `B` (in a wrapper function) and then multiplying by `B'` may speed up things a bit, because of contiguity.

Maybe this is dumb, but instead of iterating the entire giant matrix, why not just get the coordinates of the structurally non-zero elements and operate on just those?

`SparseArrays.findnz`