# Best way to repeatedly multiply a Vector by a SparseMatrix

Hi, in my code I need to evaluate A^n * v, where A is a possibly very large sparse matrix and v is a vector.

I suppose it is much faster to perform `A * (A * ... * (A * v))` rather than raising A to the n-th power first, but what is the most efficient way to do so?

I currently have

```julia
using LinearAlgebra, SparseArrays

function A_pow_n_B!(res, A, n::Integer, v)
    tmp = similar(v)
    mul!(res, A, v)          # res = A * v
    for i in 1:n-1
        mul!(tmp, A, res)    # tmp = A * res
        copyto!(res, tmp)    # copy back so the next iteration reads res
    end
    return res
end
```

Can this code be further improved?

I guess so, as the `copyto!` isn't necessary in every iteration. You could alternate between `mul!(tmp, A, res)` and `mul!(res, A, tmp)` and return `tmp` or `res` in the end depending on which holds the result.
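A minimal sketch of that idea (the function name here is just illustrative; swapping the two bindings makes the `copyto!` unnecessary):

```julia
using LinearAlgebra, SparseArrays

# Sketch: ping-pong between two buffers by swapping bindings,
# so no copyto! is needed; the returned array holds A^n * v.
function A_pow_n_B_pingpong(A, n::Integer, v)
    res, tmp = similar(v), similar(v)
    mul!(res, A, v)              # res = A * v
    for i in 2:n
        res, tmp = tmp, res      # swap which buffer is written next
        mul!(res, A, tmp)        # res = A * (previous result)
    end
    return res
end
```

Since only the bindings are swapped, the caller gets back whichever buffer ends up holding the result, which is why this version allocates both buffers itself rather than mutating a caller-supplied `res`.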


What you say makes a lot of sense. But when I tried it, the time difference seems negligible, since most of the time is spent doing `mul!` (in my use case n \sim 10):

```julia
function new_A_pow_n_B!(res, x, n::Integer, y)
    tmp = similar(y)
    if isodd(n)
        mul!(res, x, y)
        for i in 2:2:n
            mul!(tmp, x, res)
            mul!(res, x, tmp)
        end
    else
        mul!(tmp, x, y)
        for i in 2:2:n-1
            mul!(res, x, tmp)
            mul!(tmp, x, res)
        end
        mul!(res, x, tmp)
    end
    return res
end
```

I get

```julia
d = 5000
r = zeros(ComplexF64, d)   # res must be complex to hold the result

@benchmark new_A_pow_n_B!($(r), $(sprand(ComplexF64, d, d, .01)), 9, $(rand(ComplexF64, d)))
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     3.181 ms (0.00% GC)
  median time:      3.551 ms (0.00% GC)
  mean time:        3.696 ms (0.05% GC)
  maximum time:     7.348 ms (0.00% GC)
  --------------
  samples:          1349
  evals/sample:     1

@benchmark A_pow_n_B!($(r), $(sprand(ComplexF64, d, d, .01)), 9, $(rand(ComplexF64, d)))
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     3.254 ms (0.00% GC)
  median time:      3.716 ms (0.00% GC)
  mean time:        3.947 ms (1.17% GC)
  maximum time:     53.210 ms (87.77% GC)
  --------------
  samples:          1264
  evals/sample:     1
```

FWIW, the gain should depend on `d`, though. So for very large `d` …

An eigendecomposition (or, for symmetric/Hermitian A, a singular value decomposition) can be used to calculate A^n in one step. Depending on the size of A and n it may be cheaper.
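To illustrate the decomposition route, here is a sketch assuming A is diagonalizable (the function name is hypothetical; `eigen` needs a dense matrix, so a sparse A must be densified first, and for a real non-symmetric A the eigenvalues can be complex):

```julia
using LinearAlgebra

# Sketch: A = P * D * P⁻¹ gives A^n = P * D^n * P⁻¹, so
# A^n * v = P * (D^n * (P \ v)).  Only valid if A is diagonalizable.
function pow_via_eigen(A::AbstractMatrix, n::Integer, v)
    F = eigen(Matrix(A))             # dense eigendecomposition
    return F.vectors * (Diagonal(F.values .^ n) * (F.vectors \ v))
end
```

For n ≲ 10 and a very sparse A this is usually slower than n sparse matrix–vector products, since the dense decomposition alone costs O(d³).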


… and have smaller error, also depending on `A` and `n`.


There is no requirement on the size, but let's say it can vary from 100 to 10^6, while n \lesssim 10. The matrix is sparse. Do you think it is worth considering SVD?

What people call “sparse” is very subjective, so it depends. But if the nonzeros sit mostly around the diagonal and there are no tricky conditioning/cancellation issues, I would just go with the algorithm you first posted, as an SVD may be unnecessarily complex. There are partial algorithms like

but this may be overkill.

The thing is that I can't assume anything about the structure of the matrix, or its density. But, sticking to the repeated multiplication in the OP, are there other optimizations possible with Julia?

I think your original function is fine, I would just make sure that `v` is not sparse to begin with if I don't expect sparsity to be preserved. But if you literally can't assume anything, then it is hard to say more.
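For instance, if `v` happens to arrive as a `SparseVector`, converting it once up front keeps every subsequent product a plain dense matvec (a small sketch):

```julia
using SparseArrays

v_sparse = sprand(1_000, 0.01)   # example sparse input vector
v = Vector(v_sparse)             # densify once, before the repeated products
```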

You can also optimize the multiplication itself (since that's the only thing that matters). Transposing `A` to `B` (in a wrapper function) and then multiplying by `B'` may speed things up a bit, because of memory contiguity.
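A sketch of that trick (the function name is hypothetical; `copy(transpose(A))` materializes the transpose once, and the lazy `transpose(B)` wrapper is then numerically the same matrix as `A`):

```julia
using LinearAlgebra, SparseArrays

# Sketch: materialize B = transpose(A) once, then multiply with the
# lazy wrapper transpose(B) in the hot loop.  For a SparseMatrixCSC,
# transpose(B) * v accumulates each output entry from one column of B,
# which can have friendlier memory access than A * v.
function A_pow_n_B_transposed(A, n::Integer, v)
    B = copy(transpose(A))       # one-time CSC transpose of A
    Bt = transpose(B)            # lazy wrapper; equals A numerically
    res, tmp = similar(v), similar(v)
    mul!(res, Bt, v)
    for i in 2:n
        res, tmp = tmp, res
        mul!(res, Bt, tmp)
    end
    return res
end
```

Note that for the complex matrices in the benchmarks above, `transpose` rather than `'` (adjoint) is needed, since the adjoint would also conjugate the entries.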

Maybe this is dumb, but instead of iterating over the entire giant matrix, why not just get the coordinates of the structurally non-zero elements and operate on just those?

`SparseArrays.findnz`
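For reference, `findnz` returns the row indices, column indices, and values of the stored entries. Note, though, that `A * v` for a `SparseMatrixCSC` already visits only the stored entries, so an explicit `findnz` loop is unlikely to beat `mul!`. A minimal example:

```julia
using SparseArrays

# Stored entries: (1,1)=10, (2,1)=20, (2,3)=30
A = sparse([1, 2, 2], [1, 1, 3], [10.0, 20.0, 30.0], 3, 3)
rows, cols, vals = findnz(A)   # coordinates and values, in column-major order
```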